(57) Abstract

Receptor recognition factors exist that recognize the specific cell receptor to which a specific ligand has been bound, and that may
thereby signal and/or initiate the binding of the transcription factor to the DNA site. The receptor recognition factor is, in one instance,
a part of a transcription factor, and also may interact with other transcription factors to cause them to activate and travel to the nucleus
for DNA binding. The receptor recognition factor appears to be second-messenger-independent in its activity, as overt perturbations in
second messenger concentrations are of no effect. The concept of the invention is illustrated by the results of studies conducted with
interferon (IFN)-stimulated gene transcription, and particularly, the activation caused by both IFNα and IFNγ. Specific DNA and amino
acid sequences for various human and murine receptor recognition factors are provided, as are polypeptide fragments of two of the ISGF-3
genes, and antibodies have also been prepared and tested. The polypeptides confirm direct involvement of tyrosine kinase in intracellular
message transmission. Numerous diagnostic and therapeutic materials and utilities are also disclosed.

RECEPTOR RECOGNITION FACTORS, PROTEIN SEQUENCES
AND METHODS OF USE THEREOF

RELATED PUBLICATIONS

The Applicants are authors or co-authors of several articles directed to the subject
matter of the present invention. (1) Darnell et al., "Interferon-Dependent
Transcriptional Activation: Signal Transduction Without Second Messenger
Involvement?" THE NEW BIOLOGIST, 2(10):1-4, (1990); (2) X. Fu et al.,
"ISGF3, The Transcriptional Activator Induced by Interferon α, Consists of
Multiple Interacting Polypeptide Chains" PROC. NATL. ACAD. SCI. USA,
87:8555-8559 (1990); (3) D.S. Kessler et al., "IFNα Regulates Nuclear
Translocation and DNA-Binding Affinity of ISGF3, A Multimeric Transcriptional
Activator" GENES AND DEVELOPMENT, 4:1753 (1990); (4) C. Schindler et al.,
"Interferon-Dependent Tyrosine Phosphorylation of a Latent Cytoplasmic
Transcription Factor" Science, 257:809-812 (1992); (5) Ke Shuai et al.,
"Interferon-γ triggers transcription through cytoplasmic tyrosine phosphorylation
of a 91 kD DNA binding protein" Science, 258:1808 (1992); and (6) International
Patent Publication No. WO 93/19179, "IFN RECEPTORS RECOGNITION
FACTORS, PROTEIN SEQUENCES AND METHODS OF USE THEREOF,"
published 30 September 1993.

TECHNICAL FIELD OF THE INVENTION

The present invention relates generally to intracellular receptor recognition
proteins or factors (i.e., groups of proteins), and to methods and compositions
including such factors or the antibodies reactive toward them, or analogs thereof, in
assays and for diagnosing, preventing and/or treating cellular debilitation,
derangement or dysfunction. More particularly, the present invention relates to
particular molecules that exhibit both receptor recognition and message delivery
via DNA binding in an interferon-dependent manner, and specifically that directly
participate both in the interaction with the liganded receptor at the cell surface and
in the activity of transcription in the nucleus as a DNA binding protein. The
invention likewise relates to the antibodies and other entities that are specific to
this factor and that would thereby selectively modulate its activity.



BACKGROUND OF THE INVENTION 

There are several possible pathways of signal transduction that might be followed
after a polypeptide ligand binds to its cognate cell surface receptor. Within
minutes of such ligand-receptor interaction, genes that were previously quiescent
are rapidly transcribed (Murdoch et al., 1982; Larner et al., 1984; Friedman et
al., 1984; Greenberg and Ziff, 1984; Greenberg et al., 1985). One of the most
physiologically important, yet poorly understood, aspects of these immediate
transcriptional responses is their specificity: the set of genes activated, for
example, by platelet-derived growth factor (PDGF), does not completely overlap
with the one activated by nerve growth factor (NGF) or tumor necrosis factor
(TNF) (Cochran et al., 1983; Greenberg et al., 1985; Almendral et al., 1988; Lee
et al., 1990). The interferons (IFN) activate sets of other genes entirely. Even
IFNα and IFNγ, whose presence results in the slowing of cell growth and in an
increased resistance to viruses (Tamm et al., 1987), do not activate exactly the
same set of genes (Larner et al., 1984; Friedman et al., 1984; Celis et al., 1987,
1985; Larner et al., 1986).



The current hypotheses related to signal transduction pathways in the cytoplasm do
not adequately explain the high degree of specificity observed in
polypeptide-dependent transcriptional responses. The most commonly discussed pathways of
signal transduction that might ultimately lead to the nucleus depend on properties
of cell surface receptors containing tyrosine kinase domains [for example, PDGF,
epidermal growth factor (EGF), colony-stimulating factor (CSF), insulin-like
growth factor-1 (IGF-1); see Gill, 1990; Hunter, 1990] or of receptors that interact
with G-proteins (Gilman, 1987). These two groups of receptors mediate changes
in the intracellular concentrations of second messengers that, in turn, activate one
of a series of protein phosphokinases, resulting in a cascade of phosphorylations
(or dephosphorylations) of cytoplasmic proteins.

It has been widely conjectured that the cascade of phosphorylations secondary to
changes in intracellular second messenger levels is responsible for variations in the
rates of transcription of particular genes (Bourne, 1988, 1990; Berridge, 1987;
Gill, 1990; Hunter, 1990). However, there are at least two reasons to question
the suggestion that global changes in second messengers participate in the chain of
events leading to specific transcriptional responses dependent on specific receptor
occupation by polypeptide ligands.

First, there is a limited number of second messengers (cAMP, diacyl glycerol,
phosphoinositides, and Ca2+ are the most prominently discussed), whereas the
number of known cell surface receptor-ligand pairs of only the tyrosine kinase and
G-protein varieties, for example, already greatly outnumbers the list of second
messengers, and could easily stretch into the hundreds (Gill, 1990; Hunter, 1990).
In addition, since many different receptors can coexist on one cell type at any
instant, a cell can be called upon to respond simultaneously to two or more
different ligands with an individually specific transcriptional response each
involving a different set of target genes. Second, a number of receptors for
polypeptide ligands are now known that have neither tyrosine kinase domains nor
any structure suggesting interaction with G-proteins. These include the receptors
for interleukin-2 (IL-2) (Leonard et al., 1985), IFNα (Uze et al., 1990), IFNγ
(Aguet et al., 1988), NGF (Johnson et al., 1986), and growth hormone (Leung et
al., 1987). The binding of each of these receptors to its specific ligand has been
demonstrated to stimulate transcription of a specific set of genes. For these
reasons it seems unlikely that global intracellular fluctuations in a limited set of
second messengers are integral to the pathway of specific, polypeptide
ligand-dependent, immediate transcriptional responses.



International Patent Publication No. WO 93/19179 (30 September 1993, by James
E. Darnell Jr. et al.) disclosed the existence of receptor recognition factors, now
termed signal transducers and activators of transcription (STAT). The nucleotide
sequences of cDNA encoding receptor recognition factors having molecular
weights of 113 kD (i.e., 113 kD protein, Stat113, or Stat2), 91 kD (i.e., 91 kD
protein, Stat91, or Stat1α) and 84 kD (i.e., 84 kD protein, Stat84, or Stat1β) are
reiterated herein in SEQ ID NOS:1, 3, and 5, respectively; the corresponding
deduced amino acid sequences of the STAT proteins are shown in SEQ ID NOS:2,
4, and 6, respectively. Stat84 was found to be a truncated form of Stat91. There
is 42% amino acid sequence similarity between Stat113 and Stat91/84 in an
overlapping 715 amino acid sequence, including four leucine and one valine heptad
repeats in the middle helix region, and several tyrosine residues were conserved
near the ends of both proteins. The receptor recognition proteins thus possess
multiple properties, among them: 1) recognizing and being activated during such
recognition by receptors; 2) being translocated to the nucleus by an inhibitable
process (e.g., NaF inhibits translocation); and 3) combining with transcription
activating proteins or acting themselves as transcription activation proteins. All
of these properties are possessed by the proteins described herein. In
particular, the proteins are activated by binding of interferons to receptors on
cells, in particular interferon-α (all three Stat proteins) and interferon-γ (Stat91).
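As an illustration of how a figure such as the 42% value over an overlapping region can be computed, the following is a minimal sketch that scores percent identity between two pre-aligned sequences. It rests on several assumptions: the function name percent_identity and the toy sequences are invented for illustration and are not taken from the sequence listing; gaps are written as "-"; and the score is strict identity, whereas a similarity measure may also credit conservative substitutions, so this sketch would not necessarily reproduce the value reported above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns in which both residues are identical.

    Both inputs are assumed to be already aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        # Skip columns where either sequence has a gap.
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy, invented sequences (not from SEQ ID NOS:2, 4, or 6).
toy_a = "MSQWYELQ-LD"
toy_b = "MAQWEMLQNLD"
print(percent_identity(toy_a, toy_b))
```

In this toy case ten columns are compared and seven match. A real comparison would first require an alignment step, which is outside the scope of this sketch.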

SUMMARY OF THE INVENTION 

In accordance with the present invention, additional members of the family of
receptor recognition factors (also termed herein signal transducers and activators
of transcription - STAT) have been further characterized that appear to interact
directly with receptors that have been occupied by their ligand on cellular
surfaces, and which in turn either become active transcription factors, or activate
or directly associate with transcription factors that enter the cell's nucleus and
specifically bind on predetermined sites and thereby activate the genes. It should
be noted that the receptor recognition proteins thus possess multiple properties,
among them: 1) recognizing and being activated during such recognition by
receptors; 2) being translocated to the nucleus by an inhibitable process (e.g., NaF
inhibits translocation); and 3) combining with transcription activating proteins or
acting themselves as transcription activation proteins, and that all of these
properties are possessed by the proteins described herein.

A further property of the receptor recognition factors is dimerization to form
homodimers or heterodimers upon activation by phosphorylation of tyrosine. In a
specific embodiment, infra, Stat91 and Stat84 form homodimers and a
Stat91-Stat84 heterodimer. Accordingly, the present invention is directed to such dimers,
which can form spontaneously by phosphorylation of the STAT protein, or which
can be prepared synthetically by chemically cross-linking two like or unlike STAT
proteins.

The present invention further relates to receptor recognition factors that are
functionally active fragments of the 91 kD receptor recognition factor, particularly
such fragments that contain an amino acid residue corresponding to the tyrosine
701 residue, and preferably that contain a corresponding phosphotyrosine residue.
In a different embodiment, the functionally active fragment further comprises the
SH2 domain, particularly the SH2 domain that has a residue corresponding to an
arginine-602 residue. It is envisioned that such functionally active receptor
recognition factors comprise at least about 8 amino acid residues.
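To make the idea of a fragment of at least about 8 residues that contains a given residue concrete, the following is a minimal sketch that extracts a fixed-length window around a 1-based residue position. The function name fragment_around and the toy sequence are hypothetical; the toy string is not SEQ ID NO:4, and the sketch says nothing about whether a given window would be functionally active.

```python
def fragment_around(sequence: str, position: int, length: int = 8) -> str:
    """Return a window of `length` residues that contains the residue at
    1-based `position`, shifted inward if it would run past either end."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    if length > len(sequence):
        raise ValueError("fragment length exceeds sequence length")
    index = position - 1                    # convert to 0-based index
    start = max(0, index - length // 2)     # try to center the residue
    end = start + length
    if end > len(sequence):                 # clamp at the C-terminal end
        end = len(sequence)
        start = end - length
    return sequence[start:end]


# Toy, invented 20-residue sequence with a single tyrosine (Y) at position 14.
toy = "ACDEFGHIKLMNPYQRSTVW"
print(fragment_around(toy, 14))
```

For the 91 kD protein, the analogous call would pass the sequence of SEQ ID NO:4 and position 701; that usage is an assumption about how one might apply the sketch, not a procedure described in the text.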

The invention contemplates inhibitory fragments of the 91 kD protein. In one
embodiment, the SH2 domain of the 91 kD protein can competitively inhibit
phosphorylation of the whole protein or fragment thereof containing tyrosine 701.
In another embodiment, an inhibitory fragment can compete with the 91 kD
protein for binding to a tyrosine kinase. Such an inhibitory fragment may contain
a residue corresponding to tyrosine 701.

The receptor recognition factor is proteinaceous in composition and is believed to
be present in the cytoplasm. The recognition factor is not demonstrably affected
by concentrations of second messengers, but does exhibit direct interaction
with tyrosine kinase domains, although it exhibits no apparent interaction with
G-proteins. More particularly, the 91 kD human interferon (IFN)-γ factor (hence,
formerly also termed "GAF"), represented by SEQ ID NO:4, directly interacts with
DNA after acquiring phosphate on tyrosine located at position 701 of the amino
acid sequence.

The recognition factor is now known to comprise several proteinaceous
substituents, in the instance of IFNα and IFNγ. Three proteins derived from the
factor ISGF-3 have been successfully sequenced and their sequences are set forth
in SEQ ID NOS:1, 2; SEQ ID NOS:3, 4; and SEQ ID NOS:5, 6, herein (see
International Patent Publication No. WO 93/19179). The present invention is
therefore particularly directed to additional members of the STAT family,
including a murine gene encoding the 91 kD protein (SEQ ID NO:4), which has been
identified and sequenced. The nucleotide sequence (SEQ ID NO:7) and deduced
amino acid sequence (SEQ ID NO:8) of the murine homolog of SEQ ID NO:4 are
shown in FIGURE 1A-1C.

In a further embodiment, murine genes encoding homologs of the recognition
factor have been successfully sequenced and cloned into plasmids. A gene in
plasmid 13sf1 has the nucleotide sequence (SEQ ID NO:9) and deduced amino
acid sequence (SEQ ID NO:10) as shown in FIGURE 2A-D. A gene in plasmid
19sf6 has the nucleotide sequence (SEQ ID NO:11) and deduced amino acid
sequence (SEQ ID NO:12) shown in FIGURE 3A-E.

It is particularly noteworthy that the protein sequence of SEQ ID NO:2 and the
sequence of the proteins of SEQ ID NO:4 and SEQ ID NO:6 derive, respectively,
from two different but related genes. Moreover, the protein sequence of FIGURE
1 (SEQ ID NO:8) derives from a murine gene that is analogous to the gene
encoding the protein of SEQ ID NO:4. Of further note is that the protein
sequences of FIGURES 2 (SEQ ID NO:10) and 3 (SEQ ID NO:12) derive from
two genes that are different from, but related to, the protein of FIGURE 1 (SEQ
ID NO:8). It is clear from these discoveries that a family of genes exists, and that
further family members likewise exist. Accordingly, as demonstrated herein, by
use of hybridization techniques, additional such family members will be found.



Further, the capacity of such family members to function in the manner of the
receptor recognition factors disclosed herein may be assessed by determining
those ligands that cause the phosphorylation of the particular family members.

In its broadest aspect, the present invention extends to a receptor recognition
factor implicated in the transcriptional stimulation of genes in target cells in
response to the binding of a specific polypeptide ligand to its cellular receptor on
said target cell, said receptor recognition factor having the following
characteristics:

a) apparent direct interaction with the ligand-bound receptor complex
and activation of one or more transcription factors capable of binding with a
specific gene;

b) an activity demonstrably unaffected by the presence or concentration
of second messengers;

c) direct interaction with tyrosine kinase domains; and

d) a perceived absence of interaction with G-proteins.

In a further aspect, the receptor recognition (STAT) protein forms a dimer upon
activation by phosphorylation.



In a specific example, the receptor recognition factor represented by SEQ ID
NO:4 possesses the added capability of acting as a translation protein and, in
particular, as a DNA binding protein in response to interferon-γ stimulation. This
discovery presages an expanded role for the proteins in question, and other
proteins and like factors that have heretofore been characterized as receptor
recognition factors. It is therefore apparent that a single factor may indeed
provide the nexus between the liganded receptor at the cell surface and direct
participation in DNA transcriptional activity in the nucleus. This pleiotypic factor
has the following characteristics:

a) It interacts with an interferon-γ-bound receptor kinase complex;

b) It is a tyrosine kinase substrate; and

c) When phosphorylated, it serves as a DNA binding protein.

More particularly, the factor represented by SEQ ID NO:4 is interferon-dependent
in its activity and is responsive to interferon stimulation, particularly that of
interferon-γ. It has further been discovered that activation of the factor
represented by SEQ ID NO:4 requires phosphorylation of tyrosine-701 of the
protein. In particular, phosphorylation of tyrosine-701 is required for nuclear
transport, DNA binding, and transcription activation. Furthermore, tyrosine
phosphorylation requires the presence of a functionally active SH2 domain in the
protein. Preferably, such SH2 domain contains an amino acid residue
corresponding to an arginine at position 602 of the protein.

In a still further aspect, the present invention extends to a receptor recognition
factor interactive with a liganded interferon receptor, which receptor recognition
factor possesses the following characteristics:

a) it is present in cytoplasm;

b) it undergoes tyrosine phosphorylation upon treatment of cells with IFNα
or IFNγ;

c) it activates transcription of an interferon stimulated gene;

d) it stimulates either an ISRE-dependent or a gamma activated site
(GAS)-dependent transcription in vivo;

e) it interacts with IFN cellular receptors; and

f) it undergoes nuclear translocation upon stimulation of the IFN cellular
receptors with IFN.

The factor of the invention represented by SEQ ID NO:4 appears to act in similar
fashion to an earlier determined site-specific DNA binding protein that is
interferon-γ dependent and that has been earlier called the γ activating factor
(GAF). Specifically, interferon-γ-dependent activation of this factor occurs
without new protein synthesis and appears within minutes of interferon-γ
treatment, achieves maximum extent between 15 and 30 minutes thereafter, and
then disappears after 2-3 hours. These further characteristics of identification and
action assist in the evaluation of the present factor for applications having both
diagnostic and therapeutic significance.

In a particular embodiment, the present invention relates to all members of the
herein disclosed family of receptor recognition factors, specifically the proteins
whose sequences are represented by one or more of SEQ ID NO:8, SEQ ID
NO:10, or SEQ ID NO:12.

The present invention also relates to a recombinant DNA molecule or cloned gene,
or a degenerate variant thereof, which encodes a receptor recognition factor, or a
fragment thereof, that possesses a molecular weight of about 113 kD and an amino
acid sequence set forth in FIGURE 1 (SEQ ID NO:8). In yet another
embodiment, the receptor recognition factor has an amino acid sequence set forth
in FIGURE 2 (SEQ ID NO:10); preferably a nucleic acid molecule, in particular a
recombinant DNA molecule or cloned gene, encoding such receptor recognition
factor has a nucleotide sequence or is complementary to a DNA sequence shown
in FIGURE 2 (SEQ ID NO:9). In still another embodiment, the receptor
recognition factor has an amino acid sequence set forth in FIGURE 3 (SEQ ID
NO:12); preferably a nucleic acid molecule, in particular a recombinant DNA
molecule or cloned gene, encoding such receptor recognition factor has a
nucleotide sequence or is complementary to a DNA sequence shown in FIGURE 3
(SEQ ID NO:11).

The human and murine DNA sequences of the receptor recognition factors of the
present invention, or portions thereof, may be prepared as probes to screen for
complementary sequences and genomic clones in the same or alternate species.
The present invention extends to probes so prepared that may be provided for
screening cDNA and genomic libraries for the receptor recognition factors. For
example, the probes may be prepared with a variety of known vectors, such as the
phage λ vector. The present invention also includes the preparation of plasmids
including such vectors, and the use of the DNA sequences to construct vectors
expressing antisense RNA or ribozymes which would attack the mRNAs of any or
all of the DNA sequences set forth in FIGURES 1, 2, and 3 (SEQ ID NOS:7, 9,
and 11, respectively). Correspondingly, the preparation of antisense RNA and
ribozymes is included herein.
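The relationship between a coding DNA sequence, its complementary strand, and an antisense RNA can be sketched in a few lines. The following is a minimal illustration under stated assumptions: the function names reverse_complement and antisense_rna and the six-base toy input are invented here, only the unambiguous bases A, C, G, and T are handled, and nothing in it reflects probe or ribozyme design considerations such as length or hybridization conditions.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def antisense_rna(coding_dna: str) -> str:
    """Return the RNA complementary to the mRNA of `coding_dna`.

    The mRNA has the same sequence as the coding strand (with U for T),
    so its antisense partner is the reverse complement written as RNA.
    """
    return reverse_complement(coding_dna).upper().replace("T", "U")


# Toy, invented input (not taken from SEQ ID NOS:7, 9, or 11).
toy_coding = "ATGGCC"
print(reverse_complement(toy_coding))
print(antisense_rna(toy_coding))
```

A probe for the complementary strand or an antisense construct against one of the disclosed sequences would apply the same base-pairing rule to the relevant portion of that sequence.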

The present invention also includes receptor recognition factor proteins having the
activities noted herein, and that display the amino acid sequences set forth and
described above and selected from SEQ ID NO:8, SEQ ID NO:10 and SEQ ID
NO:12.

In a further embodiment of the invention, the full DNA sequence of the
recombinant DNA molecule or cloned gene so determined may be operatively
linked to an expression control sequence which may be introduced into an
appropriate host. The invention accordingly extends to unicellular hosts
transformed with the cloned gene or recombinant DNA molecule comprising a
DNA sequence encoding the present receptor recognition factor(s), and more
particularly, the complete DNA sequence determined from the sequences set forth
above and in SEQ ID NO:7, SEQ ID NO:9 and SEQ ID NO:11.

According to other preferred features of certain preferred embodiments of the
present invention, a recombinant expression system is provided to produce
biologically active animal or human receptor recognition factor.

The present invention naturally contemplates several means for preparation of the
recognition factor, including as illustrated herein known recombinant techniques,
and the invention is accordingly intended to cover such synthetic preparations
within its scope. The isolation of the cDNA amino acid sequences disclosed
herein facilitates the reproduction of the recognition factor by such recombinant
techniques, and accordingly, the invention extends to expression vectors prepared
from the disclosed DNA sequences for expression in host systems by recombinant
DNA techniques, and to the resulting transformed hosts.

The invention includes an assay system for screening of potential drugs effective to
modulate transcriptional activity of target mammalian cells by interrupting or
potentiating the recognition factor or factors. In one instance, the test drug could
be administered to a cellular sample with the ligand that activates the receptor
recognition factor, or an extract containing the activated recognition factor, to
determine its effect upon the binding activity of the recognition factor to any
chemical sample (including DNA), or to the test drug, by comparison with a
control.

The assay system could more importantly be adapted to identify drugs or other
entities that are capable of binding to the receptor recognition and/or transcription
factors or proteins, either in the cytoplasm or in the nucleus, thereby inhibiting or
potentiating transcriptional activity. Such assay would be useful in the
development of drugs that would be specific against particular cellular activity, or
that would potentiate such activity, in time or in level of activity. For example,
such drugs might be used to modulate cellular response to shock, or to treat other
pathologies, as for example, in making IFN more potent against cancer.

In yet a further embodiment, the invention contemplates antagonists of the activity
of a receptor recognition factor (STAT). In particular, an agent or molecule that
inhibits dimerization (homodimerization or heterodimerization) can be used to
block transcription activation effected by an activated, phosphorylated STAT
protein. In a specific embodiment, the antagonist can be a peptide having the
sequence of a portion of an SH2 domain of a STAT protein, or the
phosphotyrosine domain of a STAT protein, or both. If the peptide contains both
regions, preferably the regions are located in tandem, more preferably with the
SH2 domain portion N-terminal to the phosphotyrosine portion. In a specific
example, infra, such peptides are shown to be capable of disrupting dimerization
of STAT proteins.



The diagnostic utility of the present invention extends to the use of the present
receptor recognition factors in assays to screen for tyrosine kinase inhibitors.
Because the activity of the receptor recognition-transcriptional activation proteins
described herein must maintain tyrosine phosphorylation, they can be, and presumably
are, dephosphorylated by specific tyrosine phosphatases. Blocking of the specific
phosphatase is therefore an avenue of pharmacological intervention that would
potentiate the activity of the receptor recognition proteins.

The present invention likewise extends to the development of antibodies against the
receptor recognition factor(s), including naturally raised and recombinantly
prepared antibodies. For example, the antibodies could be used to screen
expression libraries to obtain the gene or genes that encode the receptor
recognition factor(s). Such antibodies could include both polyclonal and
monoclonal antibodies prepared by known genetic techniques, as well as
bi-specific (chimeric) antibodies, and antibodies including other functionalities suiting
them for additional diagnostic use conjunctive with their capability of modulating
transcriptional activity.

In particular, antibodies against specifically phosphorylated factors can be selected
and are included within the scope of the present invention for their particular
ability in following activated protein. Thus, activity of the recognition factors or
of the specific polypeptides believed to be causally connected thereto may
be followed directly by the assay techniques discussed later on, through
the use of an appropriately labeled quantity of the recognition factor or antibodies
or analogs thereof.

Thus, the receptor recognition factors, their analogs, and any
antagonists or antibodies that may be raised thereto, are capable of use in
connection with various diagnostic techniques, including immunoassays, such as a
radioimmunoassay, using for example, an antibody to the receptor recognition
factor that has been labeled by either radioactive addition, reduction with sodium
borohydride, or radioiodination.

In a further embodiment, the present invention relates to certain therapeutic
methods which would be based upon the activity of the recognition factor(s), its
(or their) subunits, or active fragments thereof, or upon agents or other drugs
determined to possess the same activity. A first therapeutic method is associated
with the prevention of the manifestations of conditions causally related to or
following from the binding activity of the recognition factor or its subunits, and
comprises administering an agent capable of modulating the production and/or
activity of the recognition factor or subunits thereof, either individually or in
mixture with each other in an amount effective to prevent the development of
those conditions in the host. For example, drugs or other binding partners to the
receptor recognition/transcription factors or proteins may be administered to
inhibit or potentiate transcriptional activity, as in the potentiation of interferon in
cancer therapy. Also, the blockade of the action of specific tyrosine phosphatases
in the dephosphorylation of activated (phosphorylated) recognition/transcription
factors or proteins presents a method for potentiating the activity of the receptor
recognition factor or protein that would concomitantly potentiate therapies based
on receptor recognition factor/protein activation.



More specifically, the therapeutic method generally referred to herein could
include the method for the treatment of various pathologies or other cellular
dysfunctions and derangements by the administration of pharmaceutical
compositions that may comprise effective inhibitors or enhancers of activation of
the recognition factor or its subunits, or other equally effective drugs developed
for instance by a drug screening assay prepared and used in accordance with a
further aspect of the present invention. For example, drugs or other binding
partners to the receptor recognition/transcription factor or proteins, as represented
by SEQ ID NO:8, 10, or 12, may be administered to inhibit or potentiate
transcriptional activity, as in the potentiation of interferon in cancer therapy.
Also, the blockade of the action of specific tyrosine phosphatases in the
dephosphorylation of activated (phosphorylated) recognition/transcription factor or
protein presents a method for potentiating the activity of the receptor recognition
factor or protein that would concomitantly potentiate therapies based on receptor
recognition factor/protein activation. Correspondingly, the inhibition or blockade
of the activation or binding of the recognition/transcription factor would affect
MHC Class II expression and consequently, would promote immunosuppression.
Materials exhibiting this activity, as illustrated later on herein by staurosporine,
may be useful in instances such as the treatment of autoimmune diseases and graft
rejection, where a degree of immunosuppression is desirable.

In particular, the proteins of ISGF-3 whose sequences are presented in SEQ ID
NOS:8, 10, or 12 herein, their antibodies, agonists, antagonists, or active
fragments thereof, could be prepared in pharmaceutical formulations for
administration in instances wherein interferon therapy is appropriate, such as to
treat chronic viral hepatitis, hairy cell leukemia, and for use of interferon in
adjuvant therapy. The specificity of the receptor proteins hereof would make it
possible to better manage the aftereffects of current interferon therapy, and would
thereby make it possible to apply interferon as a general antiviral agent.

Accordingly, it is a principal object of the present invention to provide a novel
member of the family of receptor recognition factors, and subunits of such a novel
receptor recognition factor, in purified form that exhibits certain characteristics
and activities associated with transcriptional promotion of cellular activity.






WO95/08629 PCT/US94/10849 


Is a particular object of the invention to provide fragments of such receptor 
recognition factors that inhibit activities of the factors. 

It is a further object of the present invention to provide antibodies to the receptor 
5 recognition factor and its subunits, and methods for their preparation, including 
recombinant means. 

It is a further object of the present invention to provide a method for detecting the 
presence of the receptor recognition factor and its subunits in mammals in which 
10 invasive, spontaneous, or idiopathic pathological states are suspected to be present. 

It is a further object of the present invention to provide a method and associated 
assay system for screening substances such as drugs, agents and the like, 
potentially effective in either mimicking the activity or combating the adverse 
effects of the recognition factor and/or its subunits in mammals.

It is a still further object of the present invention to provide a method for the 
treatment of mammals to control the amount or activity of the recognition factor or 
subunits thereof, so as to alter the adverse consequences of such presence or 
activity, or where beneficial, to enhance such activity.

It is a still further object of the present invention to provide a method for the 
treatment of mammals to control the amount or activity of the recognition factor or 
its subunits, so as to treat or avert the adverse consequences of invasive, 
spontaneous or idiopathic pathological states.

It is a still further object of the present invention to provide pharmaceutical 
compositions for use in therapeutic methods which comprise or are based upon the 
recognition factor, its subunits, their binding partner(s), or upon agents or drugs
that control the production, or that mimic or antagonize the activities of the
recognition factors. 

Other objects and advantages will become apparent to those skilled in the art from
a review of the ensuing description which proceeds with reference to the following 
illustrative drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS

FIGURE 1 depicts (A) the deduced amino acid sequence (SEQ ID NO:8) of and
(B-C) the DNA sequence (SEQ ID NO:7) encoding the murine 91 kD intracellular
receptor recognition factor.

FIGURE 2 depicts (A) the deduced amino acid sequence (SEQ ID NO:10) of and
(B-D) the DNA sequence (SEQ ID NO:9) encoding the 13sf1 intracellular receptor
recognition factor.

FIGURE 3 depicts (A) the deduced amino acid sequence (SEQ ID NO:12) of and
(B-E) the DNA sequence (SEQ ID NO:11) encoding the 19sf6 intracellular
receptor recognition factor.

FIGURE 4 presents identification of the phosphotyrosine residue in the 91 kD
protein. (A) Tryptic phosphopeptide map of 32P-91 kD protein from IFN-γ-treated
FS2 cells. Phosphoamino acid analysis indicated that only peptide X contains
phosphotyrosine (31). (B) Edman degradation of peptide X (32). The position of
the PTH-P-Tyr marker detected by ultraviolet light is indicated. (C) Schematic
diagram showing the site of the phosphotyrosine residue in the 91 kD protein.
HR, heptapeptide repeat; SH2, Src homology domain 2; and SH3, Src homology
domain 3. (D) The synthetic peptide LDGPKGTGYIKTELI (SEQ ID NO:13),
which was phosphorylated with 32P-labeled tyrosine, was digested with trypsin and
analyzed by 2D peptide mapping either alone (left panel) or mixed with the same
amount of 32P-labeled peptide X (right panel). Ori, origin. The synthetic peptide
(10 μg) (obtained from Genet.es) was incubated with 1 U of p45"" (Oncogene
Science), in 50 mM Hepes (pH 7.4), 0.1 mM EDTA, 0.015% Brij 35, 0.1 mM
ATP, 10 mM MgCl2 and 2 μCi of [γ-32P]ATP for 30 min at 30°C. The 32P-labeled
peptide was subjected to electrophoresis at pH 3.5 on a thin layer
chromatography plate and purified. Tryptic digestion of the 32P-labeled peptide
was done as described (32).

Human FS2 cells were labeled with [32P]orthophosphate (Du Pont) for 3 hours in
phosphate-free medium and subsequently treated with IFN-γ for 10 min. Cell
lysates were immunoprecipitated with antiserum to the COOH-terminal 35 amino
acids of 91 kD (anti-91T) and separated by SDS-polyacrylamide gel
electrophoresis (PAGE) (7% gel). The 32P-labeled 91 kD band was excised and
subjected to tryptic mapping (31). Edman degradation was done as described (32,
33) with minor modifications. Peptide X (600 counts per minute) was taken
through five cycles of Edman degradation. Samples from each cycle and an
equivalent amount of untreated peptide X were analyzed by electrophoresis at pH
3.5. The PTH-P-Tyr marker was synthesized as described (31).

FIGURE 5 presents an analysis of phosphorylation of the 91 and 84 kD proteins in
established cell lines. (A) Protein immunoblot analysis with antiserum to the 91
kD protein (anti-91) of whole-cell extracts from parental 2fTGH cells (lanes 1 and
4); mutant U3 cells lacking the 91 and 84 kD proteins (lanes 2 and 3); U3 cells
expressing the 91 kD protein (C91, lane 6), the 84 kD protein (C84, lane 7), or
the Tyr701 mutant MNC-ty (Cty, lane 5). (B) Tryptic peptide map of the 84 kD
protein. C84 cells were labeled with [32P]orthophosphate for 3 hours and then
treated with IFN-γ for 10 min; immunoprecipitation with anti-91 and tryptic
peptide mapping of the 32P-labeled 84 kD protein were done as described
(FIGURE 4). (C) Proteins in whole-cell lysates from 2fTGH (lanes 3, 4, 7 and 8)
and Cty (lanes 1, 2, 5 and 6) cells were immunoprecipitated with anti-91T (31)
and separated by SDS-PAGE (7% gel). The blot was then probed with a mAb to
phosphotyrosine, 4G10 (UBI, lanes 1 through 4). The blot was stripped and
reprobed with anti-91T (lanes 5 through 8). U3A cells (5 x 10^5) (30) were
transfected with 4 μg of expression vector and 16 μg of pBSK (Stratagene)
plasmid by the calcium phosphate procedure (35). Cells were selected in
Dulbecco's modified Eagle's medium containing G418 (0.5 mg/ml) (Gibco, BRL).
48 hours after transfection, individual colonies were screened for the expression of
appropriate proteins by protein immunoblotting. Cell lines were maintained in the
presence of G418 (0.2 mg/ml). Expression vectors using the cytomegalovirus
promoter and encoding the 91 or 84 kD protein were constructed by insertion of
the cDNA into the NotI-BamHI cloning site of an expression vector pMNC (35).
The TAT codon for Tyr701 was changed to TTT by standard mutagenesis
procedure with the polymerase chain reaction (PCR) (36, 37). The sequence was
verified by DNA sequencing (U.S. Biochemical, Cleveland, Ohio). Molecular
sizes are indicated to the left (A) or to the right (C) in kilodaltons.

FIGURE 6 presents data relating to DNA binding and nuclear localization of the
91 and 84 kD proteins. (A) DNA binding and translocation to the nucleus of the
91 and 84 kD proteins. Gel mobility-shift analysis of whole-cell extracts from Cty
(lanes 1 and 2), 2fTGH (lanes 3 and 4), U3A (lanes 5 and 6), C91 (lanes 7 and
8), and C84 (lanes 9 and 10) cells treated with IFN-γ for 15 min (+) or untreated
(-). A 21 nucleotide oligomer containing the GAS sequence from the Ly-6E gene
(34) was labeled and used as a probe for shift assays as described (31). (B)
Nuclear localization tested by immunofluorescence. Cells from stable cell lines
C91 (a and b), C84 (c and d), and Cty (e and f) were stained with anti-91T (a, b, e,
and f) and anti-91 (c and d) as described (31). Untreated, a, c, and e; IFN-γ for
30 min, b, d, and f.

FIGURE 7 presents an analysis of transcriptional activation. An oligonucleotide
corresponding to the herpes simplex virus thymidine kinase (TK) promoter from
-35 to +10 was fused to the HindIII site of pZLUC, a luciferase reporter construct
(TK-LUC). One copy of the 91 kD binding site [a 21 nucleotide oligomer from
the Ly-6E gene (34)] was inserted into the BamHI cloning site of TK-LUC
(GAS-LUC). U3 cells were transfected by the calcium phosphate method as described
(FIGURE 5) with 4 μg of each construct. The cells were also transfected with 4
μg of pMNC alone (35) (MNC) or pMNC encoding the 91 kD protein (MNC-91)
or the 84 kD protein (MNC-84) or the Tyr701 mutant of the 91 kD protein
(MNC-ty). Lane 1, MNC-91 + GAS-LUC; lane 2, MNC-84 + GAS-LUC; lane 3,
MNC + GAS-LUC; lane 4, MNC-ty + GAS-LUC; lane 5, MNC-91 + TK-LUC;
and lane 6, GAS-LUC. Relative transfection efficiencies were monitored by
inclusion of a β-galactosidase expression plasmid (pCMVβ, Promega). Then, 36
hours after transfection, cells were treated with IFN-γ (5 ng/ml) for 6 hours,
collected, and assayed for luciferase activities (Promega). (A) Data shown are
taken from one representative experiment and represent the relative luciferase
activity in cells treated with IFN-γ as compared with that from untreated cells
(arbitrarily set to 1 U). (The luciferase assay was corrected for relative expression
of β-galactosidase.) Each transfection was independently repeated at least three
times. (B) Cell lysates from these same transfections were analyzed for the
expression of proteins by protein immunoblotting with anti-91.
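
The normalization described for FIGURE 7 (luciferase activity corrected for β-galactosidase expression, then expressed relative to untreated cells set to 1 U) can be sketched as follows. This is only an illustrative sketch: the function name and every reading below are hypothetical and are not data from the experiments described herein.

```python
def fold_induction(luc_treated, bgal_treated, luc_untreated, bgal_untreated):
    """Relative luciferase activity of treated cells versus untreated cells.

    Each luciferase reading is divided by the beta-galactosidase reading from
    the same lysate to correct for transfection efficiency; the corrected
    untreated value then defines 1 U.
    """
    if min(bgal_treated, bgal_untreated, luc_untreated) <= 0:
        raise ValueError("readings must be positive")
    treated = luc_treated / bgal_treated
    untreated = luc_untreated / bgal_untreated
    return treated / untreated


# Hypothetical raw readings in arbitrary units, for illustration only.
example = fold_induction(luc_treated=9000.0, bgal_treated=1.5,
                         luc_untreated=1000.0, bgal_untreated=1.0)
print(f"relative luciferase activity: {example:.1f} U")
```

Dividing by the co-transfected reporter first means that a plate that simply took up more DNA is not mistaken for one showing stronger induction.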


FIGURE 8 demonstrates that R602 in the 91 kD protein SH2 domain is required for
tyrosine phosphorylation. a) Western blot analysis of whole cell extracts from
mutant U3A cell line (lane 3); parental 2fTGH cell line (lane 4); or U3A-derived
cell lines transfected with an expression vector containing an R602 -> Leu602
mutation (lanes 1 and 2). Antibody used was anti-91, which recognizes both the
91 and 84 kD proteins (15, 31). b) Immunoprecipitates with anti-91T antibody
were subjected to 7.5% SDS-PAGE and probed with an anti-phosphotyrosine
antibody. Mutagenesis was carried out by standard PCR procedure. The CGG
codon for Arg602 was mutated to CTG, which encodes Leu. Transfection and
selection of stable cell lines were as described in Figures 4-7 and Examples 3 and 4.
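
The two point mutations named in the legends (TAT to TTT at Tyr701 in FIGURE 5, and CGG to CTG at Arg602 in FIGURE 8) are each a single-base change in the second codon position. The sketch below illustrates this with the standard genetic code; the helper name and the reduced codon table are assumptions made for illustration.

```python
# Only the four codons mentioned in the legends (standard genetic code).
CODON_TO_RESIDUE = {"TAT": "Tyr", "TTT": "Phe", "CGG": "Arg", "CTG": "Leu"}


def describe_substitution(wild_type, mutant):
    """Summarize a codon change and the residue change it produces."""
    if len(wild_type) != 3 or len(mutant) != 3:
        raise ValueError("codons must be three bases long")
    changed = [i + 1 for i in range(3) if wild_type[i] != mutant[i]]
    return {
        "residue_change": (CODON_TO_RESIDUE[wild_type], CODON_TO_RESIDUE[mutant]),
        "changed_positions": changed,  # 1-based positions within the codon
    }


for wt, mut in (("TAT", "TTT"), ("CGG", "CTG")):
    print(wt, "->", mut, describe_substitution(wt, mut))
```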

FIGURE 9. Determination of molecular weights of Stat91 and phospho Stat91 by
native gel analysis. A) Western blot analysis of fractions from affinity
purification. Extracts from human FS2 fibroblasts treated with IFN-γ (Ext), the
unbound fraction (Flow), the fraction washed with Buffer A0.2 (A0.2), and the
bound fraction eluted with Buffer A0.8 (A0.8) were immunoblotted with anti-91T.
B) Native gel analysis. Phosphorylated Stat91 (the A0.8 fraction from A) and
unphosphorylated Stat91 (the Flow fraction from A) were analyzed on 4.5%,
5.5%, 6.5% and 7.5% native polyacrylamide gels followed by immunoblotting
with anti-91T. The top of gels (TOP) and the migration position of bromophenol
blue (BPB) are indicated. C) Ferguson plots. The relative mobilities (Rm) of
Stat91 and phospho Stat91 were obtained from FIGURE 9B (see Experimental
Procedures). Closed circle: Chicken egg albumin (45 kD); Cross: Bovine
serum albumin, monomer (66 kD); Open square: Bovine serum albumin, dimer
(132 kD); Open circle: Urease, trimer (272 kD); Open triangle: Unphosphorylated
Stat91; Closed triangle: Phosphorylated Stat91. D) Determination of molecular
weights from the standard curve. The molecular weights of phosphorylated and
unphosphorylated Stat91 proteins (indicated as closed and open arrows,
respectively) were obtained by extrapolation of their retardation coefficients.
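
The Ferguson analysis outlined for FIGURE 9 can be sketched as two straight-line fits. In the sketch below, log10(Rm) is fitted against gel concentration to give a retardation coefficient (the negative slope), and a standard curve of retardation coefficient against molecular weight is then used to read off an unknown. The linear form of the standard curve, the function names, and all numerical values are assumptions made for illustration; the mobilities are synthetic and are not measurements from the figure.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retardation_coefficient(gel_percents, relative_mobilities):
    """Negative slope of log10(Rm) versus gel concentration."""
    logs = [math.log10(rm) for rm in relative_mobilities]
    slope, _ = linear_fit(gel_percents, logs)
    return -slope


def estimate_mass(kr_unknown, standards):
    """Read a mass (kD) off a linear standard curve of (mass_kd, kr) pairs."""
    slope, intercept = linear_fit([m for m, _ in standards],
                                  [k for _, k in standards])
    return (kr_unknown - intercept) / slope


# Synthetic data: mobilities generated from an assumed relation, not measured.
GELS = [4.5, 5.5, 6.5, 7.5]


def synthetic_rm(kr):
    return [10 ** (0.1 - kr * t) for t in GELS]


standard_masses = [45.0, 66.0, 132.0, 272.0]
standards = [(m, retardation_coefficient(GELS, synthetic_rm(0.02 + 0.0005 * m)))
             for m in standard_masses]
kr_sample = retardation_coefficient(GELS, synthetic_rm(0.02 + 0.0005 * 180.0))
print(round(estimate_mass(kr_sample, standards), 1))
```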

FIGURE 10. Determination of molecular weights by glycerol gradients.
A) Western blot analysis. Extracts from human Bud8 fibroblasts treated with
IFN-γ (the rightmost lane) and every other fraction from fraction 16 to 34 were
analyzed on 7.5% SDS-PAGE followed by immunoblotting with anti-91T. The
peak of phosphorylated Stat91 (fraction 20) and the peak of unphosphorylated
Stat91 (fraction 30) are indicated by a closed and open arrow, respectively.
B) Mobility shift analysis. Every other fraction from the gradients was
analyzed. C) Graphic representation of the data from A and B. Peak fraction
numbers of protein standards are plotted versus their molecular weight. The
positions of the peaks of phosphorylated and unphosphorylated Stat91 protein are
indicated by the closed and open arrows, respectively. Standards are ferritin (Fer,
440 kD), catalase (Cat, 232 kD), ferritin half unit (Fer 1/2, 220 kD), aldolase
(Ald, 158 kD), and bovine serum albumin (BSA, 68 kD).

FIGURE 11. Stat91 in cell extracts binds DNA as a dimer. A) Western blot
analysis. Extracts from stable cell lines expressing either Stat84 (C84), or Stat91L
(C91L) or both (Cmx) were analyzed on 7.5% SDS-PAGE followed by
immunoblotting with anti-91. B) Gel mobility shift analysis. Extracts from stable
cell lines (FIGURE 11A) untreated (-) or treated with IFN-γ (+) were analyzed. The
positions of Stat91 homodimer (91L), Stat84 homodimer (84), and the heterodimer
(84*91) are indicated.


FIGURE 12. Formation of heterodimer by denaturation and renaturation. 
Cytoplasmic (Left Panel) or nuclear extracts (Right Panel) from IFN-7-treated cell 
lines expressing either Stat84 (C84) or Stat91 (C91) were analyzed by gel mobility 
shift assays. +: with addition; -: without addition; D/R: samples were subjected 
to guanidinium hydrochloride denaturation and renaturation treatment.

FIGURE 13. Diagrammatic representation of dissociation and reassociation 
analysis. 

FIGURE 14. Dissociation-reassociation analysis with peptides. Gel mobility shift
analysis with IFN-γ-treated nuclear extracts from cell lines expressing Stat91L
(C91L, lane 15) or Stat84 (C84, lane 14) or a mixture of both (lanes 1-13, 16-18) in
the presence of increasing concentrations of various peptides. 91-Y,
unphosphorylated peptide from Stat91 (LDGPKGTGYIKTELI) (SEQ ID NO:15);
91Y-p, phosphotyrosyl peptide from Stat91 (GY*IKTE) (SEQ ID NO:16); 113Y-p,
phosphotyrosyl peptide with high binding affinity to Src SH2 domain
(EPQY*EEIPIYL, Songyang et al., 1993, Cell 72:767-778) (SEQ ID NO:18).
Final concentrations of peptides added: 1 μM (lane 8), 4 μM (lanes 2, 5, 11), 10
μM (lane 9), 40 μM (lanes 3, 6, 10, 12, 14-18), 160 μM (lanes 4, 7, 13). +: with
addition; -: without addition. Right panel: antiserum tests for identity of gel-shift
bands.

FIGURE 15. Dissociation-reassociation analysis with GST fusion proteins. A)
SDS-PAGE (12%) analysis of purified GST fusion proteins as visualized by
Coomassie blue. GST-91 SH2, native SH2 domain of Stat91; GST-91 mSH2, R602
to L602 mutant; GST-91 SH3, SH3 domain of Stat91; GST Src SH2, the SH2
domain of src protein. The same amount (1 μg) of each fusion protein was loaded.
Protein markers were run in lane 1 as indicated. B) Dissociation-reassociation
analysis. Dissociating agents were GST fusion proteins purified from bacterial
expression as shown above. Final concentrations of fusion proteins added are 0.5
μM (lanes 2, 5, 8, 11, 14), 2.5 μM (lanes 3, 6, 9, 12, 15) and 5 μM (lanes 4, 7,
10, 13, 17, 18). +: with addition; -: without addition; FP: fusion proteins.

FIGURE 16. Comparison of Stat91 SH2 structure with known SH2 structures.
The Stat91 sequence is disclosed herein (SEQ ID NO:4). The structures used for
the other SH2s are Src (Waksman et al., 1992, Nature 358:646-653) (SEQ ID
NO:19), Abl (Overduin et al., 1992, Proc. Natl. Acad. Sci. USA 89:11673-77 and
1992, Cell 70:697-704) (SEQ ID NO:20), Lck (Eck et al., 1993, Nature 362:87-91)
(SEQ ID NO:21), and p85αN (Booker et al., 1992, Nature 358:684-687)
(SEQ ID NO:22). The alignment of the determined structures is by direct
coordinate superimposition of the backbone structures. The names of secondary
structural features and significant residues are based on the scheme of Eck et al.,
1993. The boundaries and extents of the structure features are indicated by [ — ].
The starting numbers for the parent sequences are shown in parentheses.
Experimentally determined structurally conserved regions are from Src, p85α, and
Abl (Cowburn, unpublished). The root mean square deviation of
three-dimensionally aligned structures differs by less than 1 Angstrom for the
backbone non-hydrogen atoms in the sections marked by the XXX.

DETAILED DESCRIPTION 

In accordance with the present invention there may be employed conventional
molecular biology, microbiology, and recombinant DNA techniques within the
skill of the art. Such techniques are explained fully in the literature. See, e.g.,
Maniatis, Fritsch & Sambrook, "Molecular Cloning: A Laboratory Manual"
(1982); "DNA Cloning: A Practical Approach," Volumes I and II (D.N. Glover
ed. 1985); "Oligonucleotide Synthesis" (M.J. Gait ed. 1984); "Nucleic Acid
Hybridization" [B.D. Hames & S.J. Higgins eds. (1985)]; "Transcription And
Translation" [B.D. Hames & S.J. Higgins, eds. (1984)]; "Animal Cell Culture"
[R.I. Freshney, ed. (1986)]; "Immobilized Cells And Enzymes" [IRL Press,
(1986)]; B. Perbal, "A Practical Guide To Molecular Cloning" (1984).


Therefore, if appearing herein, the following terms shall have the definitions set 
out below. 

The terms "receptor recognition factor", "receptor recognition-tyrosine kinase
factor", "receptor recognition factor/tyrosine kinase substrate", "receptor
recognition/transcription factor", "recognition factor", "recognition factor
protein(s)", "signal transducers and activators of transcription", "STAT", and any
variants not specifically listed, may be used herein interchangeably, and as used
throughout the present application and claims refer to proteinaceous material
including single or multiple proteins, and extend to those proteins having the
amino acid sequence data described herein and presented in FIGURE 1 (SEQ ID
NO:8), FIGURE 2 (SEQ ID NO:10), and in FIGURE 3 (SEQ ID NO:12), and the
profile of activities set forth herein and in the Claims. Accordingly, proteins
displaying substantially equivalent or altered activity are likewise contemplated.
These modifications may be deliberate, for example, such as modifications
obtained through site-directed mutagenesis, or may be accidental, such as those
obtained through mutations in hosts that are producers of the complex or its named
subunits. Also, the terms "receptor recognition factor", "recognition factor",
"recognition factor protein(s)", "signal transducers and activators of transcription",
and "STAT" are intended to include within their scope proteins specifically recited
herein as well as all substantially homologous analogs and allelic variations.

The amino acid residues described herein are preferred to be in the "L" isomeric
form. However, residues in the "D" isomeric form can be substituted for any L
amino acid residue, as long as the desired functional property of
immunoglobulin-binding is retained by the polypeptide. NH2 refers to the free
amino group present at the amino terminus of a polypeptide. COOH refers to the
free carboxy group present at the carboxy terminus of a polypeptide. In keeping
with standard polypeptide nomenclature, J. Biol. Chem., 243:3552-59 (1969),
abbreviations for amino acid residues are shown in the following Table of
Correspondence:

TABLE OF CORRESPONDENCE

SYMBOL                      AMINO ACID
1-Letter      3-Letter

Y             Tyr           tyrosine
G             Gly           glycine
F             Phe           phenylalanine
M             Met           methionine
A             Ala           alanine
S             Ser           serine
I             Ile           isoleucine
L             Leu           leucine
T             Thr           threonine
V             Val           valine
P             Pro           proline
K             Lys           lysine
H             His           histidine
Q             Gln           glutamine
E             Glu           glutamic acid
W             Trp           tryptophan
R             Arg           arginine
D             Asp           aspartic acid
N             Asn           asparagine
C             Cys           cysteine


It should be noted that all amino-acid residue sequences are represented herein by
formulae whose left and right orientation is in the conventional direction of
amino-terminus to carboxy-terminus. Furthermore, it should be noted that a dash at the
beginning or end of an amino acid residue sequence indicates a peptide bond to a
further sequence of one or more amino-acid residues. The above Table is
presented to correlate the three-letter and one-letter notations which may appear
alternately herein.
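
By way of illustration only, the correspondence set out in the Table can be expressed as a lookup that converts a one-letter sequence, written amino-terminus to carboxy-terminus, into three-letter notation. The function name is an assumption; the example peptide is the synthetic peptide named in the description of FIGURE 4.

```python
# One-letter to three-letter symbols, as listed in the Table of Correspondence.
ONE_TO_THREE = {
    "Y": "Tyr", "G": "Gly", "F": "Phe", "M": "Met", "A": "Ala",
    "S": "Ser", "I": "Ile", "L": "Leu", "T": "Thr", "V": "Val",
    "P": "Pro", "K": "Lys", "H": "His", "Q": "Gln", "E": "Glu",
    "W": "Trp", "R": "Arg", "D": "Asp", "N": "Asn", "C": "Cys",
}


def to_three_letter(sequence):
    """Convert a one-letter sequence (N-terminus first) to three-letter form."""
    return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())


print(to_three_letter("LDGPKGTGYIKTELI"))
```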



A "DNA molecule" refers to the polymeric form of deoxyribonucleotides (adenine,
guanine, thymine, or cytosine) in either its single-stranded form or a
double-stranded helix. This term refers only to the primary and secondary structure of
the molecule, and does not limit it to any particular tertiary forms. Thus, this
term includes double-stranded DNA found, inter alia, in linear DNA molecules
(e.g., restriction fragments), viruses, plasmids, and chromosomes. In discussing
the structure of particular double-stranded DNA molecules, sequences may be
described herein according to the normal convention of giving only the sequence in
the 5' to 3' direction along the nontranscribed strand of DNA (i.e., the strand
having a sequence homologous to the mRNA).
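
The strand convention just described can be illustrated with a short sketch: given the nontranscribed strand written 5' to 3', the mRNA has the same sequence with uracil in place of thymine, and the transcribed (template) strand is its reverse complement. The function names and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def template_strand(nontranscribed):
    """Reverse complement of the nontranscribed strand, written 5' to 3'."""
    return nontranscribed.upper().translate(COMPLEMENT)[::-1]


def mrna(nontranscribed):
    """mRNA sequence homologous to the nontranscribed strand (T replaced by U)."""
    return nontranscribed.upper().replace("T", "U")


example = "ATGTATTGA"  # hypothetical sequence, 5' to 3'
print(mrna(example))             # AUGUAUUGA
print(template_strand(example))  # TCAATACAT
```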

A DNA "coding sequence" is a double-stranded DNA sequence which is
transcribed and translated into a polypeptide in vivo when placed under the control
of appropriate regulatory sequences. The boundaries of the coding sequence are
determined by a start codon at the 5' (amino) terminus and a translation stop
codon at the 3' (carboxyl) terminus. A coding sequence can include, but is not
limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA
sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA
sequences. A polyadenylation signal and transcription termination sequence will
usually be located 3' to the coding sequence.
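
As a minimal sketch of the boundaries just defined, the following locates the first region that begins with a start codon and ends with an in-frame stop codon. It assumes the standard ATG start and TAA/TAG/TGA stops, searches one strand only, and ignores introns; the function name and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_coding_sequence(dna):
    """Return the first ATG...stop region read in frame, or None if absent."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step through codons in the frame set by this start codon.
        for i in range(start, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None


print(first_coding_sequence("GGATGTATCGGTAACC"))  # ATGTATCGGTAA
```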

Transcriptional and translational control sequences are DNA regulatory sequences,
such as promoters, enhancers, polyadenylation signals, terminators, and the like,
that provide for the expression of a coding sequence in a host cell. A "promoter
sequence" is a DNA regulatory region capable of binding RNA polymerase in a
cell and initiating transcription of a downstream (3' direction) coding sequence.
For purposes of defining the present invention, the promoter sequence is bounded
at its 3' terminus by the transcription initiation site and extends upstream (5'
direction) to include the minimum number of bases or elements necessary to
initiate transcription at levels detectable above background. Within the promoter
sequence will be found a transcription initiation site (conveniently defined by
mapping with nuclease S1), as well as protein binding domains (consensus
sequences) responsible for the binding of RNA polymerase. Eukaryotic promoters
will often, but not always, contain "TATA" boxes and "CAT" boxes. Prokaryotic
promoters contain Shine-Dalgarno sequences in addition to the -10 and -35
consensus sequences. An "expression control sequence" is a DNA sequence that
controls and regulates the transcription and translation of another DNA sequence,
e.g., an enhancer or suppressor element. A coding sequence is "under the
control" of transcriptional and translational control sequences in a cell when RNA
polymerase transcribes the coding sequence into mRNA, which is then translated
into the protein encoded by the coding sequence.

A DNA sequence is "operatively linked" to an expression control sequence when
the expression control sequence controls and regulates the transcription and
translation of that DNA sequence. The term "operatively linked" includes having
an appropriate start signal (e.g., ATG) in front of the DNA sequence to be
expressed and maintaining the correct reading frame to permit expression of the
DNA sequence under the control of the expression control sequence and
production of the desired product encoded by the DNA sequence. If a gene that
one desires to insert into a recombinant DNA molecule does not contain an
appropriate start signal, such a start signal can be inserted in front of the gene.

The term "standard hybridization conditions" refers to salt and temperature 
conditions substantially equivalent to 5 x SSC and 65°C for both hybridization and 
wash.

A "signal sequence" can be included before the coding sequence. This sequence
encodes a signal peptide, N-terminal to the polypeptide, that communicates to the
host cell to direct the polypeptide to the cell surface or secrete the polypeptide into
the media, and this signal peptide is clipped off by the host cell before the protein
leaves the cell. Signal sequences can be found associated with a variety of
proteins native to prokaryotes and eukaryotes. 

The term "oligonucleotide", as used herein in referring to the probe of the present 
invention, is defined as a molecule comprised of two or more ribonucleotides, 
preferably more than three. Its exact size will depend upon many factors which,
in turn, depend upon the ultimate function and use of the oligonucleotide. 

The term "primer" as used herein refers to an oligonucleotide, whether occurring
naturally as in a purified restriction digest or produced synthetically, which is
capable of acting as a point of initiation of synthesis when placed under conditions
in which synthesis of a primer extension product, which is complementary to a
nucleic acid strand, is induced, i.e., in the presence of nucleotides and an inducing
agent such as a DNA polymerase and at a suitable temperature and pH. The
primer may be either single-stranded or double-stranded and must be sufficiently
long to prime the synthesis of the desired extension product in the presence of the
inducing agent. The exact length of the primer will depend upon many factors,
including temperature, source of primer and use of the method. For example, for
diagnostic applications, depending on the complexity of the target sequence, the
oligonucleotide primer typically contains 15-25 or more nucleotides, although it
may contain fewer nucleotides.

The primers herein are selected to be "substantially" complementary to different
strands of a particular target DNA sequence. This means that the primers must be
sufficiently complementary to hybridize with their respective strands. Therefore,
the primer sequence need not reflect the exact sequence of the template. For
example, a non-complementary nucleotide fragment may be attached to the 5' end
of the primer, with the remainder of the primer sequence being complementary to
the strand. Alternatively, non-complementary bases or longer sequences can be
interspersed into the primer, provided that the primer sequence has sufficient
complementarity with the sequence of the strand to hybridize therewith and
thereby form the template for the synthesis of the extension product.
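
The notion of a primer that carries a non-complementary 5' fragment yet remains complementary over the rest of its length can be sketched as below. The sketch aligns the primer with its target site at the 3' end and reports the fraction of primer bases that pair; the function names, sequences, and the six-base tail are hypothetical, and no particular fraction is asserted here to be sufficient for hybridization.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    return sequence.upper().translate(COMPLEMENT)[::-1]


def complementary_fraction(primer, target_site):
    """Fraction of primer bases that pair with target_site (both 5' to 3').

    The primer is aligned at its 3' end with the perfectly complementary
    primer for the site; bases extending past the site at the 5' end
    (a tail) are counted as unpaired.
    """
    perfect = reverse_complement(target_site)
    paired = sum(1 for p, q in zip(reversed(primer.upper()), reversed(perfect))
                 if p == q)
    return paired / len(primer)


site = "GATTACAGATTACAGGCC"                   # hypothetical 18-base target site
exact_primer = reverse_complement(site)
tailed_primer = "GAATTC" + exact_primer       # hypothetical 5' tail added
print(complementary_fraction(exact_primer, site))   # 1.0
print(complementary_fraction(tailed_primer, site))  # 0.75
```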

A cell has been "transformed" by exogenous or heterologous DNA when such
DNA has been introduced inside the cell. The transforming DNA may or may not
be integrated (covalently linked) into chromosomal DNA making up the genome of
the cell. In prokaryotes, yeast, and mammalian cells for example, the
transforming DNA may be maintained on an episomal element such as a plasmid.
With respect to eukaryotic cells, a stably transformed cell is one in which the
transforming DNA has become integrated into a chromosome so that it is inherited
by daughter cells through chromosome replication. This stability is demonstrated
by the ability of the eukaryotic cell to establish cell lines or clones comprised of a
population of daughter cells containing the transforming DNA. A "clone" is a
population of cells derived from a single cell or common ancestor by mitosis. A
"cell line" is a clone of a primary cell that is capable of stable growth in vitro for
many generations.


Two DNA sequences are "substantially homologous" when at least about 75%
(preferably at least about 80%, and most preferably at least about 90 or 95%) of
the nucleotides match over the defined length of the DNA sequences. Sequences
that are substantially homologous can be identified by comparing the sequences
using standard software available in sequence data banks, or in a Southern
hybridization experiment under, for example, stringent conditions as defined for
that particular system. Defining appropriate hybridization conditions is within the
skill of the art. See, e.g., Maniatis et al., supra; DNA Cloning, Vols. I & II,
supra; Nucleic Acid Hybridization, supra.
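
The percentage criterion above can be sketched as a simple count of matching positions over a defined length. The sketch assumes the two sequences have already been aligned to equal length without gaps, which real comparison software would handle; the function names, the default threshold argument, and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of matching nucleotides over two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def substantially_homologous(seq_a, seq_b, threshold=75.0):
    """True when the match percentage meets the chosen threshold."""
    return percent_identity(seq_a, seq_b) >= threshold


a = "ATGTATCGGTAA"  # hypothetical sequences differing at two positions
b = "ATGTTTCTGTAA"
print(round(percent_identity(a, b), 1), substantially_homologous(a, b))
```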

A "heterologous" region of the DNA construct is an identifiable segment of DNA
within a larger DNA molecule that is not found in association with the larger
molecule in nature. Thus, when the heterologous region encodes a mammalian
gene, the gene will usually be flanked by DNA that does not flank the mammalian
genomic DNA in the genome of the source organism. Another example of a
heterologous coding sequence is a construct where the coding sequence itself is not
found in nature (e.g., a cDNA where the genomic coding sequence contains
introns, or synthetic sequences having codons different than the native gene).
Allelic variations or naturally-occurring mutational events do not give rise to a
heterologous region of DNA as defined herein.



An "antibody" is any immunoglobulin, including antibodies and fragments thereof,
that binds a specific epitope. The term encompasses polyclonal, monoclonal, and
chimeric antibodies, the last mentioned described in further detail in U.S. Patent
Nos. 4,816,397 and 4,816,567. An "antibody combining site" is that structural
portion of an antibody molecule comprised of heavy and light chain variable and
hypervariable regions that specifically binds antigen. The phrase "antibody
molecule" in its various grammatical forms as used herein contemplates both an
intact immunoglobulin molecule and an immunologically active portion of an
immunoglobulin molecule. Exemplary antibody molecules are intact
immunoglobulin molecules, substantially intact immunoglobulin molecules and
those portions of an immunoglobulin molecule that contain the paratope, including
those portions known in the art as Fab, Fab', F(ab')2 and F(v), which portions are
preferred for use in the therapeutic methods described herein. The phrase
"monoclonal antibody" in its various grammatical forms refers to an antibody
having only one species of antibody combining site capable of immunoreacting
with a particular antigen. A monoclonal antibody thus typically displays a single
binding affinity for any antigen with which it immunoreacts. A monoclonal
antibody may therefore contain an antibody molecule having a plurality of
antibody combining sites, each immunospecific for a different antigen; e.g., a
bispecific (chimeric) monoclonal antibody.

The phrase "pharmaceutically acceptable" refers to molecular entities and
compositions that are physiologically tolerable and do not typically produce an
allergic or similar untoward reaction, such as gastric upset, dizziness and the like,
when administered to a human. Preferably, as used herein, the term
"pharmaceutically acceptable" means approved by a regulatory agency of the
Federal or a state government or listed in the U.S. Pharmacopeia or other
generally recognized pharmacopeia for use in animals, and more particularly in
humans.

The phrase "therapeutically effective amount" is used herein to mean an amount
sufficient to prevent, and preferably reduce by at least about 30 percent, more
preferably by at least 50 percent, most preferably by at least 90 percent, a
clinically significant change in the S phase activity of a target cellular mass, or
other feature of pathology such as for example, elevated blood pressure, fever or
white cell count as may attend its presence and activity.

In its primary aspect, the present invention concerns the identification of novel
receptor recognition factors, and the isolation and sequencing of particular
receptor recognition factor proteins that are believed to be present in cytoplasm
and that serve as signal transducers between a particular cellular receptor having
bound thereto an equally specific polypeptide ligand, and the comparably specific
transcription factor that enters the nucleus of the cell and interacts with a specific
DNA binding site for the activation of the gene to promote the predetermined
response to the particular polypeptide stimulus. The present disclosure confirms
that specific and individual receptor recognition factors exist that correspond to
known stimuli such as tumor necrosis factor, nerve growth factor, platelet-derived
growth factor and the like. Specific evidence of this is set forth herein with
respect to the interferons α and γ (IFNα and IFNγ).

A further property of the receptor recognition factors (also termed herein signal
transducers and activators of transcription, STAT) is dimerization to form
homodimers or heterodimers upon activation by phosphorylation of tyrosine. In a
specific embodiment, infra, Stat91 and Stat84 form homodimers and a
Stat91-Stat84 heterodimer. Accordingly, the present invention is directed to such dimers,
which can form spontaneously by phosphorylation of the STAT protein, or which
can be prepared synthetically by chemically cross-linking two like or unlike STAT
proteins.

The present receptor recognition factor is likewise noteworthy in that it appears
not to be demonstrably affected by fluctuations in second messenger activity and
concentration. The receptor recognition factor proteins appear to act as a substrate
for tyrosine kinase domains, but do not appear to interact with G-proteins,
and therefore do not appear to be second messengers.

A particular receptor recognition factor identified herein by SEQ ID NO: 4, Stat91
or Stat1α, has been determined to be present in cytoplasm and serves as a signal
transducer and a specific transcription factor in response to IFN-γ stimulation that
enters the nucleus of the cell and interacts directly with a specific DNA binding
site for the activation of the gene to promote the predetermined response to the
particular polypeptide stimulus. This particular factor also acts as a translation
protein and, in particular, as a DNA binding protein in response to interferon-γ
stimulation. This factor is likewise noteworthy in that it has the following
characteristics:

a) It interacts with an interferon-γ-bound receptor kinase complex;

b) It is a tyrosine kinase substrate; and

c) When phosphorylated, it serves as a DNA binding protein.

More particularly, the factor of SEQ ID NO:4 directly interacts with DNA after
acquiring phosphate on tyrosine located at position 701 of the amino acid
sequence. Also, interferon-γ-dependent activation of this factor occurs without
new protein synthesis and appears within minutes of interferon-γ treatment,
achieves maximum extent between 15 and 30 minutes thereafter, and then
disappears after 2-3 hours.

Stat91 is more particularly characterized by at least one of the following
additional characteristics:

d) Phosphorylation of tyrosine-701 is required for nuclear transport;

e) Phosphorylation of tyrosine-701 is required for DNA binding;

f) Phosphorylation of tyrosine-701 is required for transcription
activation;

g) A functional SH2 domain is required for tyrosine-701
phosphorylation.

Yet a further property of the present factor is its ability to dimerize when
phosphorylated. Accordingly, a further property of the receptor recognition
factors (also termed herein signal transducers and activators of transcription,
STAT) is dimerization to form homodimers or heterodimers upon activation by
phosphorylation of tyrosine. In a specific embodiment, infra, Stat91 and Stat84
form homodimers and a Stat91-Stat84 heterodimer. Accordingly, the present
invention is directed to such dimers, which can form spontaneously by
phosphorylation of the STAT protein, or which can be prepared synthetically by
chemically cross-linking two like or unlike STAT proteins.

The present invention further relates to receptor recognition factors that are
functionally active fragments, e.g., as exemplified herein with fragments of the 91
kD receptor recognition factor, particularly such fragments that contain an amino
acid residue corresponding to the tyrosine 701 residue, and preferably that contain
a corresponding phosphotyrosine residue. In a different embodiment, the
functionally active fragment further comprises the SH2 domain, particularly the
SH2 domain that has a residue corresponding to an arginine-602 residue of the
91-kD receptor recognition factor. It is envisioned that such functionally active
receptor recognition factors comprise at least about 8 amino acid residues.

The invention contemplates inhibitory fragments of such receptor recognition
proteins, e.g., as exemplified herein with respect to the 91 kD protein. In one
embodiment, the SH2 domain of the 91 kD protein can competitively inhibit
phosphorylation of the whole protein or fragment thereof containing tyrosine 701.
In another embodiment, an inhibitory fragment can compete with the 91 kD
protein for binding to a tyrosine kinase. Such an inhibitory fragment may contain
a residue corresponding to tyrosine 701.

In yet a further embodiment, the invention contemplates antagonists of the activity
of a receptor recognition factor (STAT). In particular, an agent or molecule that
inhibits dimerization (homodimerization or heterodimerization) can be used to
block transcription activation effected by an activated, phosphorylated STAT
protein. In a specific embodiment, the antagonist can be a peptide having the
sequence of a portion of an SH2 domain of a STAT protein, or the
phosphotyrosine domain of a STAT protein, or both. If the peptide contains both
regions, preferably the regions are located in tandem, more preferably with the
SH2 domain portion N-terminal to the phosphotyrosine portion. In a specific
example, infra, such peptides are shown to be capable of disrupting dimerization
of STAT proteins.

Subsequent to the filing of the initial applications directed to the present invention,
the inventors have termed each member of the family of receptor recognition
factors a signal transducer and activator of transcription (STAT) protein. Each
STAT protein is designated by the apparent molecular weight (e.g., Stat113,
Stat91, Stat84, etc.), or by the order in which it has been identified (e.g., Stat1α
[Stat91], Stat1β [Stat84], Stat2 [Stat113], Stat3 [a murine protein also termed
19sf6], and Stat4 [a murine STAT protein also termed 13sf1]). As will be readily
appreciated by one of ordinary skill in the art, the choice of name has no effect on
the intrinsic characteristics of the factors described herein, which were first
disclosed in International Patent Publication No. WO 93/19179, published 30
September 1993. The present inventors have chosen to adopt this newly derived

receptor that is occupied by its ligand, and those factors that thereafter directly
interface with the gene and effect transcription and accordingly gene activation.
As suggested earlier and elaborated further on herein, the present invention
contemplates pharmaceutical intervention in the cascade of reactions in which the
receptor recognition factor is implicated, to modulate the activity initiated by the
stimulus bound to the cellular receptor.

Thus, in instances where it is desired to reduce or inhibit the gene activity
resulting from a particular stimulus or factor, an appropriate inhibitor of the
receptor recognition factor could be introduced to block the interaction of the
receptor recognition factor with those factors causally connected with gene
activation. Correspondingly, instances where insufficient gene activation is taking
place could be remedied by the introduction of additional quantities of the receptor
recognition factor or its chemical or pharmaceutical cognates, analogs, fragments
and the like.

As discussed earlier, the recognition factors or their binding partners or other
ligands or agents exhibiting either mimicry or antagonism to the recognition
factors or control over their production, may be prepared in pharmaceutical
compositions, with a suitable carrier and at a strength effective for administration
by various means to a patient experiencing an adverse medical condition associated
with specific transcriptional stimulation, for the treatment thereof. A variety of
administrative techniques may be utilized, among them parenteral techniques such
as subcutaneous, intravenous and intraperitoneal injections, catheterizations and the
like. Average quantities of the recognition factors or their subunits may vary and
in particular should be based upon the recommendations and prescription of a
qualified physician or veterinarian.

Also, antibodies including both polyclonal and monoclonal antibodies, and drugs
that modulate the production or activity of the recognition factors and/or their
subunits may possess certain diagnostic applications and may, for example, be
utilized for the purpose of detecting and/or measuring conditions such as viral
infection or the like. For example, the recognition factor or its subunits may be
used to produce both polyclonal and monoclonal antibodies to themselves in a
variety of cellular media, by such well known techniques as immunization of
rabbit using Complete and Incomplete Freund's Adjuvant and the hybridoma
technique utilizing, for example, fused mouse spleen lymphocytes and myeloma
cells respectively. These techniques have been described in numerous
publications in great detail, e.g., International Patent Publication WO 93/19179,
and do not bear repeating here.



Likewise, small molecules that mimic or antagonize the activity(ies) of the 
receptor recognition factors of the invention may be discovered or synthesized, 
and may be used in diagnostic and/or therapeutic protocols. 

As suggested earlier, the diagnostic method of the present invention comprises
examining a cellular sample or medium by means of an assay including an
effective amount of an antagonist to a receptor recognition factor/protein, such as
an anti-recognition factor antibody, preferably an affinity-purified polyclonal
antibody, and more preferably a mAb. In addition, it is preferable for the
anti-recognition factor antibody molecules used herein to be in the form of Fab, Fab',
F(ab')2 or F(v) portions or whole antibody molecules. As previously discussed,
patients capable of benefiting from this method include those suffering from
cancer, a pre-cancerous lesion, a viral infection or other like pathological
derangement. Methods for isolating the recognition factor and inducing
anti-recognition factor antibodies and for determining and optimizing the ability of
anti-recognition factor antibodies to assist in the examination of the target cells are all
well-known in the art.

The present invention further contemplates therapeutic compositions useful in
practicing the therapeutic methods of this invention. A subject therapeutic
composition includes, in admixture, a pharmaceutically acceptable excipient
(carrier) and one or more of a receptor recognition factor, polypeptide analog
thereof or fragment thereof, as described herein as an active ingredient. In a
preferred embodiment, the composition comprises an antigen capable of
modulating the specific binding of the present recognition factor within a target
cell.

The preparation of therapeutic compositions which contain polypeptides, analogs
or active fragments as active ingredients is well understood in the art. Typically,
such compositions are prepared as injectables, either as liquid solutions or
suspensions; however, solid forms suitable for solution in, or suspension in, liquid
prior to injection can also be prepared. The preparation can also be emulsified.
The active therapeutic ingredient is often mixed with excipients which are
pharmaceutically acceptable and compatible with the active ingredient. Suitable
excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like
and combinations thereof. In addition, if desired, the composition can contain
minor amounts of auxiliary substances such as wetting or emulsifying agents, or pH
buffering agents, which enhance the effectiveness of the active ingredient.

A polypeptide, analog or active fragment can be formulated into the therapeutic
composition as neutralized pharmaceutically acceptable salt forms.
Pharmaceutically acceptable salts include the acid addition salts (formed with the
free amino groups of the polypeptide or antibody molecule) and which are formed
with inorganic acids such as, for example, hydrochloric or phosphoric acids, or
such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed
from the free carboxyl groups can also be derived from inorganic bases such as,
for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and
such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol,
histidine, procaine, and the like.

The therapeutic polypeptide-, analog- or active fragment-containing compositions
are conventionally administered intravenously, as by injection of a unit dose, for
example. The term "unit dose" when used in reference to a therapeutic
composition of the present invention refers to physically discrete units suitable as
unitary dosage for humans, each unit containing a predetermined quantity of active
material calculated to produce the desired therapeutic effect in association with the
required diluent; i.e., carrier, or vehicle.

The compositions are administered in a manner compatible with the dosage
formulation, and in a therapeutically effective amount. The quantity to be
administered depends on the subject to be treated, capacity of the subject's
immune system to utilize the active ingredient, and degree of inhibition or
neutralization of recognition factor binding capacity desired. Precise amounts of
active ingredient required to be administered depend on the judgment of the
practitioner and are peculiar to each individual. However, suitable dosages may
range from about 0.1 to 20, preferably about 0.5 to about 10, and more preferably
one to several, milligrams of active ingredient per kilogram body weight of
individual per day and depend on the route of administration. Suitable regimes for
initial administration and booster shots are also variable, but are typified by an
initial administration followed by repeated doses at one or more hour intervals by
a subsequent injection or other administration. Alternatively, continuous
intravenous infusion sufficient to maintain concentrations of ten nanomolar to ten
micromolar in the blood is contemplated.
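
The arithmetic behind the per-kilogram range and the molar infusion targets quoted above can be sketched as follows. This is an illustration only: the function names, the 70 kg body weight and the use of a 91,000 Da mass (taken from the nominal size of the 91 kD factor) are assumptions made for the example and are not part of the disclosure.

```python
def daily_dose_mg(body_weight_kg, mg_per_kg_low=0.5, mg_per_kg_high=10.0):
    """Return (low, high) total daily dose in mg for a given body weight.

    The defaults mirror the "preferably about 0.5 to about 10" mg/kg/day
    range in the text; other ranges can be passed explicitly.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (body_weight_kg * mg_per_kg_low, body_weight_kg * mg_per_kg_high)


def molar_to_mg_per_litre(conc_molar, mol_weight_da):
    """Convert a molar concentration to mg/L for a species of the given mass."""
    # mol/L * g/mol = g/L; multiply by 1000 to express the result in mg/L.
    return conc_molar * mol_weight_da * 1000.0


# Hypothetical 70 kg subject.
low, high = daily_dose_mg(70.0)

# Ten nanomolar and ten micromolar of an assumed 91,000 Da polypeptide.
infusion_low = molar_to_mg_per_litre(10e-9, 91_000)
infusion_high = molar_to_mg_per_litre(10e-6, 91_000)
```

For a fragment or peptide, the molecular weight argument would have to be replaced by the mass of the species actually administered.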

The therapeutic compositions may further include an effective amount of the
factor/factor synthesis promoter antagonist or analog thereof, and one or more of
the following active ingredients: an antibiotic, a steroid. Exemplary formulations
are well known in the art, e.g., as disclosed in International Patent Publication
WO 93/19179.

Another feature of this invention is the expression of the DNA sequences disclosed
herein. As is well known in the art, DNA sequences may be expressed by
operatively linking them to an expression control sequence in an appropriate
expression vector and employing that expression vector to transform an
appropriate unicellular host.

Such operative linking of a DNA sequence of this invention to an expression
control sequence includes the provision of an initiation codon, ATG, in the
correct reading frame upstream of the DNA sequence.

A wide variety of host/expression vector combinations may be employed in
expressing the DNA sequences of this invention. Useful expression vectors, for
example, may consist of segments of chromosomal, non-chromosomal and
synthetic DNA sequences. Suitable vectors include derivatives of SV40 and
known bacterial plasmids, e.g., E. coli plasmids col E1, pCR1, pBR322, pMB9
and their derivatives, plasmids such as RP4; phage DNAs, e.g., the numerous
derivatives of phage λ, e.g., NM989, and other phage DNA, e.g., M13 and
filamentous single stranded phage DNA; yeast plasmids such as the 2μ plasmid or
derivatives thereof; vectors useful in eukaryotic cells, such as vectors useful in
insect or mammalian cells; vectors derived from combinations of plasmids and
phage DNAs, such as plasmids that have been modified to employ phage DNA or
other expression control sequences; and the like.

Any of a wide variety of expression control sequences - sequences that control the
expression of a DNA sequence operatively linked to it - may be used in these
vectors to express the DNA sequences of this invention. Such
control sequences include, for example, the early or late promoters of SV40,
CMV, vaccinia, polyoma or adenovirus, the lac system, the trp system, the TAC
system, the TRC system, the LTR system, the major operator and promoter regions
of phage λ, the control regions of fd coat protein, the promoter for
3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid
phosphatase (e.g., Pho5), the promoters of the yeast α-mating factors, and other
sequences known to control the expression of genes of prokaryotic or eukaryotic
cells or their viruses, and various combinations thereof.

A wide variety of unicellular host cells are also useful in expressing the DNA
sequences of this invention. These hosts may include well known eukaryotic and
prokaryotic hosts, such as strains of E. coli, Pseudomonas, Bacillus,
Streptomyces, fungi such as yeasts, and animal cells, such as CHO, R1.1, B-W and
L-M cells, African Green Monkey kidney cells (e.g., COS 1, COS 7, BSC1,
BSC40, and BMT10), insect cells (e.g., Sf9), and human cells and plant cells in
tissue culture.

Alternatively, a gene encoding a receptor recognition factor of the invention may
be incorporated in a transgenic expression vector, e.g., one of the
retroviral vectors, for in vitro or in vivo transfection of cells for gene therapy.



It will be understood that not all vectors, expression control sequences and hosts
will function equally well to express the DNA sequences of this invention.
Neither will all hosts function equally well with the same expression system.
However, one skilled in the art will be able to select the proper vectors,
expression control sequences, and hosts without undue experimentation to
accomplish the desired expression without departing from the scope of this
invention. For example, in selecting a vector, the host must be considered
because the vector must function in it. The vector's copy number, the ability to
control that copy number, and the expression of any other proteins encoded by the
vector, such as antibiotic markers, will also be considered.

In selecting an expression control sequence, a variety of factors will normally be
considered. These include, for example, the relative strength of the system, its
controllability, and its compatibility with the particular DNA sequence or gene to
be expressed, particularly as regards potential secondary structures. Suitable
unicellular hosts will be selected by consideration of, e.g., their compatibility with
the chosen vector, their secretion characteristics, their ability to fold proteins
correctly, and their fermentation requirements, as well as the toxicity to the host
of the product encoded by the DNA sequences to be expressed, and the ease of
purification of the expression products.

Considering these and other factors a person skilled in the art will be able to 
construct a variety of vector/expression control sequence/host combinations that 
will express the DNA sequences of this invention on fermentation or in large scale 
animal culture. 

It is further intended that receptor recognition factor analogs may be prepared
from nucleotide sequences of the protein complex/subunit derived within the scope
of the present invention. Analogs, such as fragments, may be produced, for
example, by pepsin digestion of receptor recognition factor material. Other
analogs, such as muteins, can be produced by standard site-directed mutagenesis of
receptor recognition factor coding sequences. Analogs exhibiting "receptor
recognition factor activity" such as small molecules, whether functioning as
promoters or inhibitors, may be identified by known in vivo and/or in vitro assays.

As mentioned above, a DNA sequence encoding receptor recognition factor can be
prepared synthetically rather than cloned. The DNA sequence can be designed
with the appropriate codons for the receptor recognition factor amino acid
sequence. In general, one will select preferred codons for the intended host if the
sequence will be used for expression. The complete sequence is assembled from
overlapping oligonucleotides prepared by standard methods and assembled into a
complete coding sequence. See, e.g., Edge, Nature, 292:756 (1981); Nambair et
al., Science, 223:1299 (1984); Jay et al., J. Biol. Chem., 259:6311 (1984).
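
The step of designing a coding sequence with preferred codons for the intended host can be sketched as a simple back-translation. The codon table below is a hypothetical, deliberately incomplete illustration (each codon does encode the listed residue in the standard genetic code, but the choice of "preferred" codon is an assumption, not measured usage data), and the function name is likewise invented for the example. The sketch also prepends an ATG initiation codon when the peptide does not already begin with methionine.

```python
# Hypothetical preferred-codon table for an unspecified host (illustrative subset).
PREFERRED_CODONS = {
    "M": "ATG",  # methionine (also the initiation codon)
    "Y": "TAT",  # tyrosine
    "R": "CGT",  # arginine
    "S": "AGC",  # serine
    "K": "AAA",  # lysine
    "L": "CTG",  # leucine
}


def back_translate(peptide, codon_table=PREFERRED_CODONS):
    """Return a DNA coding sequence for `peptide` using one codon per residue."""
    codons = []
    for residue in peptide.upper():
        if residue not in codon_table:
            raise ValueError(f"no preferred codon listed for residue {residue!r}")
        codons.append(codon_table[residue])
    coding = "".join(codons)
    # Ensure an initiation codon sits in frame at the 5' end.
    if not coding.startswith("ATG"):
        coding = "ATG" + coding
    return coding
```

In practice the resulting sequence would then be split into overlapping oligonucleotides for assembly, a step not shown here.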

Synthetic DNA sequences allow convenient construction of genes which will
express receptor recognition factor analogs or "muteins". Alternatively, DNA
encoding muteins can be made by site-directed mutagenesis of native receptor
recognition factor genes or cDNAs, and muteins can be made directly using
conventional polypeptide synthesis.

A general method for site-specific incorporation of unnatural amino acids into
proteins is described in Christopher J. Noren, Spencer J. Anthony-Cahill, Michael
C. Griffith, Peter G. Schultz, Science, 244:182-188 (April 1989). This method
may be used to create analogs with unnatural amino acids.

The present invention extends to the preparation of antisense nucleotides and
ribozymes that may be used to interfere with the expression of the receptor
recognition proteins at the translational level. This approach utilizes antisense
nucleic acid and ribozymes to block translation of a specific mRNA, either by
masking that mRNA with an antisense nucleic acid or cleaving it with a ribozyme.
Antisense and ribozyme technology are well known in the art, and have been
described in many publications, e.g., International Patent Publication WO
93/19179.
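
An antisense nucleic acid that masks an mRNA is, at its simplest, the reverse complement of the targeted stretch. The following minimal sketch shows that relationship; the function name and the example sequence are hypothetical, and nothing here addresses target-site selection, chemistry or delivery.

```python
# Watson-Crick complement table for RNA bases.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna_segment):
    """Return the antisense RNA (reverse complement, 5'->3') of an mRNA segment.

    DNA-style input is accepted: any T is read as U before complementing.
    """
    seq = mrna_segment.upper().replace("T", "U")
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5'->3'.
    return seq.translate(_RNA_COMPLEMENT)[::-1]


# Made-up six-base target beginning with a start codon.
oligo = antisense_rna("AUGGCC")
```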

The present invention also relates to a variety of diagnostic applications, including
methods for detecting the presence of stimuli such as the earlier referenced
polypeptide ligands, by reference to their ability to elicit the activities which are
mediated by the present receptor recognition factor.

As mentioned earlier, the receptor recognition factor can be used to produce
antibodies to itself by a variety of known techniques, and such antibodies could
then be isolated and utilized in tests for the presence of particular transcriptional
activity in suspect target cells. Many assay procedures, or formats, are well
known in the art. The "competitive" procedure is described in U.S. Patent Nos.
3,654,090 and 3,850,752. The "sandwich" procedure is described in U.S. Patent
Nos. RE 31,006 and 4,016,043. Still other procedures are known such as the
"double antibody", or "DASP" procedure. In each instance, the receptor
recognition factor forms complexes with one or more antibody(ies) or binding
partners and one member of the complex is labeled with a detectable label. The
fact that a complex has formed and, if desired, the amount thereof, can be
determined by known methods applicable to the detection of labels.

The labels most commonly employed for these studies are radioactive elements,
enzymes, chemicals which fluoresce when exposed to ultraviolet light, and others.
A number of fluorescent materials are known and can be utilized as labels. These
include, for example, fluorescein, rhodamine and auramine. The receptor
recognition factor or its binding partner(s) can also be labeled with a radioactive
element or with an enzyme. The radioactive label can be detected by any of the
currently available counting procedures. The preferred isotope may be selected
from ³H, ¹⁴C, ³²P, ³⁵S, ³⁶Cl, ⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁹⁰Y, ¹³¹I, and ¹⁸⁶Re.
Enzyme labels are likewise useful, and can be detected by any of the presently
utilized colorimetric, spectrophotometric, fluorospectrophotometric, amperometric
or gasometric techniques.

A particular assay system developed and utilized in accordance with the present
invention is known as a receptor assay. In a receptor assay, the material to be
assayed is appropriately labeled and then certain cellular test colonies are
inoculated with a quantity of both the labeled and unlabeled material, after which
binding studies are conducted to determine the extent to which the labeled material
binds to the cell receptors. In this way, differences in affinity between materials
can be ascertained.

Accordingly, a purified quantity of the receptor recognition factor may be
radiolabeled and combined, for example, with antibodies or other inhibitors
thereto, after which binding studies would be carried out. Solutions would then be
prepared that contain various quantities of labeled and unlabeled uncombined
receptor recognition factor, and cell samples would then be inoculated and
thereafter incubated. The resulting cell monolayers are then washed, solubilized
and then counted in a gamma counter for a length of time sufficient to yield a
standard error of <5%. These data are then subjected to Scatchard analysis, after
which observations and conclusions regarding material activity can be drawn.
While the foregoing is exemplary, it illustrates the manner in which a receptor
assay may be performed and utilized, in the instance where the cellular binding
ability of the assayed material may serve as a distinguishing characteristic.
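
The Scatchard analysis mentioned above rests on the single-site relation B/F = (Bmax - B)/Kd, so that a plot of bound/free against bound is a straight line with slope -1/Kd and x-intercept Bmax. A minimal sketch of that fit is given below under the assumption of a single class of non-interacting binding sites; the function name and the synthetic data in the usage lines are hypothetical and do not come from the experiments described herein.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) by least-squares fit of bound/free against bound.

    `bound` and `free` are equal-length sequences of ligand concentrations
    in the same units. Assumes one class of independent binding sites.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired (bound, free) measurements")
    if any(f <= 0 for f in free):
        raise ValueError("free concentrations must be positive")

    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    if sxx == 0:
        raise ValueError("bound values must not all be identical")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope; data do not fit a single-site model")

    kd = -1.0 / slope          # slope of the Scatchard line is -1/Kd
    bmax = -intercept / slope  # x-intercept of the line is Bmax
    return kd, bmax


# Synthetic, noise-free data generated with Kd = 2 and Bmax = 10 (arbitrary units).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
```

With real counting data, curvature in the plot would suggest that the single-site assumption does not hold, in which case this simple linear fit should not be relied on.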

An assay useful and contemplated in accordance with the present invention is
known as a "cis/trans" assay. Briefly, this assay employs two genetic constructs,
one of which is typically a plasmid that continually expresses a particular receptor
of interest when transfected into an appropriate cell line, and the second of which
is a plasmid that expresses a reporter such as luciferase, under the control of a
receptor/ligand complex. Thus, for example, if it is desired to evaluate a
compound as a ligand for a particular receptor, one of the plasmids would be a
construct that results in expression of the receptor in the chosen cell line, while the
second plasmid would possess a promoter linked to the luciferase gene in which
the response element to the particular receptor is inserted. If the compound under
test is an agonist for the receptor, the ligand will complex with the receptor, and
the resulting complex will bind the response element and initiate transcription of
the luciferase gene. The resulting chemiluminescence is then measured
photometrically, and dose response curves are obtained and compared to those of
known ligands. The foregoing protocol is described in detail in U.S. Patent No.
4,981,784 and PCT International Publication No. WO 88/03168, to which
the artisan is referred.
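
One simple way to compare dose response curves from such a reporter assay is to estimate the half-maximal dose (EC50) of each compound. The sketch below does this by linear interpolation on a log-dose axis between the two measured points that bracket the half-maximal response. It is an assumption-laden illustration: the function name is hypothetical, the response is taken to rise with dose, and a fitted sigmoid model would normally be preferred when enough points are available.

```python
import math


def ec50_by_interpolation(doses, responses):
    """Estimate EC50 from paired dose/response readings (e.g. luminescence).

    The half-maximal level is defined relative to the lowest and highest
    observed responses; doses must be positive so that log10 is defined.
    """
    if len(doses) != len(responses) or len(doses) < 2:
        raise ValueError("need at least two paired dose/response readings")
    if any(d <= 0 for d in doses):
        raise ValueError("doses must be positive")

    points = sorted(zip(doses, responses))
    lowest = min(r for _, r in points)
    highest = max(r for _, r in points)
    half = lowest + (highest - lowest) / 2.0

    # Walk adjacent pairs and interpolate in the first rising interval
    # that contains the half-maximal response.
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 <= half <= r1 and r1 > r0:
            frac = (half - r0) / (r1 - r0)
            log_ec50 = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10.0 ** log_ec50
    raise ValueError("response never crosses the half-maximal level")


# Made-up readings: dose in nM, response in arbitrary luminescence units.
estimate = ec50_by_interpolation([1, 10, 100, 1000], [0, 25, 75, 100])
```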

In a further embodiment of this invention, commercial test kits suitable for use by
a medical specialist may be prepared to determine the presence or absence of
predetermined transcriptional activity or predetermined transcriptional activity
capability in suspected target cells. In accordance with the testing techniques
discussed above, one class of such kits will contain at least the labeled receptor
recognition factor or its binding partner, for instance an antibody specific thereto,
and directions, of course, depending upon the method selected, e.g.,
"competitive", "sandwich", "DASP" and the like. The kits may also contain
peripheral reagents such as buffers, stabilizers, etc.

Accordingly, a test kit may be prepared for the demonstration of the presence or
capability of cells for predetermined transcriptional activity, comprising:

(a) a predetermined amount of at least one labeled immunochemically reactive
component obtained by the direct or indirect attachment of the present receptor
recognition factor or a specific binding partner thereto, to a detectable label;

(b) other reagents; and

(c) directions for use of said kit.

More specifically, the diagnostic test kit may comprise:

(a) a known amount of the receptor recognition factor as described above (or
a binding partner) generally bound to a solid phase to form an immunosorbent, or
in the alternative, bound to a suitable tag, or plural such end products, etc. (or
their binding partners) one of each;

(b) if necessary, other reagents; and

(c) directions for use of said test kit.

In a further variation, the test kit may be prepared and used for the purposes stated
above, which operates according to a predetermined protocol (e.g., "competitive",
"sandwich", "double antibody", etc.), and comprises:

(a) a labeled component which has been obtained by coupling the receptor
recognition factor to a detectable label;

(b) one or more additional immunochemical reagents of which at least one
reagent is a ligand or an immobilized ligand, which ligand is selected from the
group consisting of:

(i) a ligand capable of binding with the labeled component (a);

(ii) a ligand capable of binding with a binding partner of the labeled
component (a);

(iii) a ligand capable of binding with at least one of the component(s) to
be determined; and

(iv) a ligand capable of binding with at least
one of the binding partners of at least one of the component(s) to be determined;
and

(c) directions for the performance of a protocol for the detection and/or
determination of one or more components of an immunochemical reaction between
the receptor recognition factor and a specific binding partner thereto.

In accordance with the above, an assay system for screening potential drugs
effective to modulate the activity of the receptor recognition factor may be
prepared. The receptor recognition factor may be introduced into a test system,
and the prospective drug may also be introduced into the resulting cell culture, and
the culture thereafter examined to observe any changes in the transcriptional
activity of the cells, due either to the addition of the prospective drug alone, or
due to the effect of added quantities of the known receptor recognition factor.

The present invention may be better understood by reference to the following 
Examples, which are provided by way of exemplification and not limitation. 






EXAMPLE 1: IDENTIFICATION OF MURINE 91 KD PROTEIN 



A fragment of the gene encoding the human 91 kD protein was used to screen a
murine thymus and spleen cDNA library for homologous proteins. The screening
assay yielded a highly homologous gene encoding a murine polypeptide that is
greater than 95% homologous to the human 91 kD protein. The nucleic acid and
deduced amino acid sequence of the murine 91 kD protein are shown in Figure
1A-1C, and SEQ ID NO: 7 (nucleotide sequence) and SEQ ID NO: 8 (amino acid
sequence).

EXAMPLE 2: ADDITIONAL MEMBERS OF THE STAT PROTEIN FAMILY

Using a 300 nucleotide fragment amplified by PCR from the SH2 region of the
murine 91 kD protein gene, murine genes encoding two additional members of the
113-91 family of receptor recognition factor proteins were isolated from a murine
splenic/thymic cDNA library constructed in the ZAP vector, according to the
method of Sambrook et al. (1989, Molecular Cloning: A Laboratory Manual, 2nd ed.,
Cold Spring Harbor Press: Cold Spring Harbor, New York). Hybridization
was carried out at 42°C and the filters were washed at 42°C before the first
exposure (Church and Gilbert, 1984, Proc. Natl. Acad. Sci. USA 81:1991-95). Then
the filters were washed in 2X SSC, 0.1% SDS at 65°C for a second exposure. Stat1
clones survived the 65°C washing, whereas Stat3 and Stat4 clones were identified
as plaques that lost signals at 65°C. The plaques were purified and subcloned
according to Stratagene commercial protocols.
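
The two-exposure logic of this screen can be summarized in a short sketch. It is illustrative only and assumes that each plaque has simply been scored as signal-positive or signal-negative after the 42°C wash and again after the 65°C wash; the function name and the labels it returns are hypothetical and are not part of the disclosed method.

```python
def classify_plaque(signal_after_42c: bool, signal_after_65c: bool) -> str:
    """Classify one plaque from its hybridization signal on the two exposures.

    Assumption: signals are scored only as present (True) or absent (False).
    """
    if signal_after_42c and signal_after_65c:
        # Signal survives the stringent wash (the Stat1 clones in the text).
        return "retained at 65C"
    if signal_after_42c:
        # Signal lost at high stringency (the Stat3 and Stat4 clones in the text).
        return "lost at 65C"
    if signal_after_65c:
        # A signal that appears only after the stringent wash is not expected.
        return "inconsistent"
    return "no hybridization"
```

Under these assumptions, plaques scored "lost at 65C" are the candidates for related but divergent family members.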

This probe was chosen to screen for other STAT family members because, while
Stat1 and Stat2 SH2 domains are quite similar over the entire 100 to 120 amino
acid region, only the amino terminal half of the STAT SH2 domains strongly
resembles the SH2 regions found in other proteins.

The two genes have been cloned into plasmids 13sf1 and 19sf6. The nucleotide
sequence, and deduced amino acid sequence, for the 13sf1 and 19sf6 genes are
shown in Figures 2 and 3, respectively. These proteins are alternatively termed
Stat4 and Stat3, respectively.

Comparison with the sequence of Stat91 (Stat1) and Stat113 (Stat2) shows several
highly conserved regions, including the putative SH3 and SH2 domains. The
conserved amino acid stretches likely point to conserved domains that enable these
proteins to carry out transcription activation functions. Stat3, like Stat1 (Stat91),
is widely expressed, while Stat4 expression is limited to the testes, thymus, and
spleen. Stat3 has been found to be activated as a DNA binding protein through
phosphorylation on tyrosine in cells treated with EGF or IL-6 but not after IFN-γ
treatment.

[...] encoding the human and murine [...] proteins and the amino acid sequences
of the [...] murine and human 91 kD proteins. Thus, though [...] determined:
Stat1 and Stat4 genes [...] on chromosome 1 (corresponding to human 2q 32-34q).
[...] other chromosomes.

[...] 13sf1 and 19sf6 on human genomic libraries have established that genes
corresponding to [...] genes are found in humans.

[...] of these genes was evaluated by Northern hybridization analysis. The
results [...] following Table.
DISTRIBUTION OF mRNA EXPRESSION OF [...] PROTEINS

Northern analysis [...] of the mRNAs encoded by these genes. The variation [...]
indicates that the specific genes encode proteins that are responsive [...]
factors as would be expected in accordance with the present invention. [...] of
inducing phosphorylation of the newly [...] will be readily determinable based on
the tissue distribution evidence described above.

To define whether the Stat3 and Stat4 proteins were present in cells, protein
[...] sequence plus restriction sites (BamHI at the 5' end and EcoRI at the 3' end),
allowing for in-frame fusion with GST. One milligram of each antigen was used
for the immunization and three booster injections were given 4 weeks apart. Anti-Stat3
and anti-Stat4 sera were used 1:1000 in Western blots using standard
protocols. To avoid cross reactivity of the antisera, antibodies were raised against
the C-terminal of Stat3 and Stat4, the less homologous region of the protein.

These proteins were unambiguously found in several tissues where the mRNA was
known to be present. Protein expression was checked in several cell lines as well.
A protein of 89 kD reactive with Stat4 antiserum was expressed in 70Z cells, a
preB cell line, but not in many other cell lines. Stat3 was highly expressed,
predominantly as a 97 kD protein, in 70Z, HT2 (a mouse helper T cell clone), and
U937 (a macrophage-derived cell).

To prove that the full length functional cDNA clones of Stat3 and Stat4 were
obtained, the open reading frames of each cDNA were independently (i.e.,
separately) cloned into the Rc/CMV expression vector (Invitrogen) downstream of
a CMV promoter. The resulting plasmids were transfected into COS1 cells and
proteins were extracted 60 hrs post-transfection and examined by Western blot
after electrophoresis. Untransfected COS1 cells expressed a low level of 97 kD
Stat3 protein but did not express a detectable level of Stat4. Upon transfection of
the Stat3-expressing plasmid, the 97 kD Stat3 was increased at least 10-fold. An
89 kD protein antigenically related to Stat3, found as a minor band in most cell
line extracts, was also increased post-transfection. This protein therefore appears
to represent another form of Stat3 protein, or an antigenically similar protein
whose synthesis is stimulated by Stat3. Transfection with Stat4 led to the
expression of an 89 kD reactive band indistinguishable in size from the p89 Stat4
found in 70Z cell extracts.
DISCUSSION



As mentioned earlier, the observation and conclusion underlying the present
invention were crystallized from a consideration of the results of certain
investigations with particular stimuli. Particularly, the present disclosure is
illustrated by the results of work on protein factors that govern transcriptional
control of IFNα-stimulated genes, as well as more recent data on the regulation of
transcription of genes stimulated by IFNγ. The present disclosure is further
illustrated by the identification of related genes encoding protein factors responsive
to as yet unknown factors. It is expected that the murine 91 kD protein is
responsive to IFN-γ.

For example, the above represents evidence that the 91 kD protein is the tyrosine
kinase target when IFNγ is the ligand. Thus two different ligands acting through
two different receptors both use these family members. With only a modest
number of family members and combinatorial use in response to different ligands,
this family of proteins becomes an even more likely possibility to represent a
general link between ligand-occupied receptors and transcriptional control of
specific genes in the nucleus.


It is proposed and shown by the foregoing that other members of the 113-91
protein family will be and have been identified as phosphorylation targets in
response to other ligands. If, as is believed, the tyrosine phosphorylation site on
proteins in this family is conserved, one can then easily determine which family
members are activated (phosphorylated), and likewise the particular extracellular
polypeptide ligand to which that family member is responding. The modifications
of these proteins (phosphorylation and dephosphorylation) enable the preparation
and use of assays for determining the effectiveness of pharmaceuticals in
potentiating or preventing intracellular responses to various polypeptides, and such
assays are accordingly contemplated within the scope of the present invention.
Earlier work has concluded that a DNA binding protein was activated in the cell
cytoplasm in response to IFN-γ treatment and that this protein stimulated
transcription of the GBP gene (10,14). In the present work, with the aid of
antisera to proteins originally studied in connection with IFN-α gene stimulation
(7,12,15), the 91 kD ISGF-3 protein has been assigned a prominent role in IFN-γ
gene stimulation as well. The evidence for this conclusion included: 1) Antisera
specific to the 91 kD protein affected the IFN-γ dependent gel-shift complex.
2) A 91 kD protein could be cross-linked to the GAS IFN-γ activated site. 3) A
35S-labeled 91 kD protein and a 91 kD immunoreactive protein specifically purified
with the gel-shift complex. 4) The 91 kD protein is an IFN-γ dependent tyrosine
kinase substrate as indeed it had earlier proved to be in response to IFN-α (15).
5) The 91 kD protein but not the 113 kD protein moved to the nucleus in response
to IFN-γ treatment. None of these experiments prove, but they do strongly suggest,
that the same 91 kD protein acts differently in different DNA binding complexes
that are triggered by either IFN-α or IFN-γ.

These results strongly support the hypothesis originated from studies on IFN-α that
polypeptide cell surface receptors report their occupation by extracellular ligand to
latent cytoplasmic proteins that after activation move to the nucleus to trigger
transcription (4,15,21). Furthermore, because cytoplasmic phosphorylation and
factor activation is so rapid it appears likely that the functional receptor complexes
contain tyrosine kinase activity. Since the IFN-γ receptor chain that has been
cloned thus far (22) has no hint of possessing intrinsic kinase activity, perhaps
some other molecule with tyrosine kinase activity couples with the IFN-γ receptor.
Two recent results with other receptors suggest possible parallels to the situation
with the IFN receptors. The trk protein, which has an intracellular tyrosine kinase
domain, associates with the NGF receptor when that receptor is occupied (23). In
addition, the lck protein, a member of the src family of tyrosine kinases, is
co-precipitated with the T cell receptor (24). It is possible to predict that signal
transduction to the nucleus through these two receptors could involve latent
cytoplasmic substrates that form part of activated transcription factors. In any
event, it seems possible that there are kinases like trk or lck associated with the
IFN-γ receptor or with the IFN-α receptor.

With regard to the effect of phosphorylation on the 91 kD protein, it was
something of a surprise that after IFN-γ treatment the 91 kD protein becomes a
DNA binding protein. Its role must be different in response to IFN-α treatment.
There it is also phosphorylated on tyrosine and joins a complex with the 113 and
84 kD proteins but, as judged by UV cross-linking studies (7), the 91 kD protein
does not contact DNA.






In addition to becoming a DNA binding protein, it is clear that the 91 kD protein is
specifically translocated to the nucleus in the wake of IFN-γ stimulation.

EXAMPLE 3: TYROSINE 701 IS PHOSPHORYLATED
IN THE 91 kD PROTEIN



It has previously been shown that IFN-γ stimulates phosphorylation of the 91 kD
protein. Thermolysin digestion of 32P-labeled 91 kD protein from IFN-γ-treated
cells yielded a single peptide labeled on tyrosine. The 91 kD protein contains 19
tyrosines (12), and to determine the location of the phosphorylated residue or
residues, a tryptic digest of 32P-labeled 91 kD protein from IFN-γ-treated cells
(FIGURE 4A) was examined. IFN-γ induced phosphorylation of a single tryptic
peptide (X) on tyrosine. Peptide X was recovered and subjected to stepwise Edman
degradation. The labeled phosphotyrosine was released in the fourth
degradative cycle (FIGURE 4B). Computer alignment of all the potential tryptic
peptides showed a single peptide (amino acids 698 to 703) in which tyrosine was
the fourth amino acid, revealing this peptide as the major candidate for
IFN-γ-stimulated tyrosine kinase action (FIGURE 4C). Note that the original sequence
of the 91 kD protein omitted an 11 amino acid segment from residues 261 to 271.
Thus, the putative phosphorylated peptide contained a single tyrosine at residue
701, confirming the expectation of phosphorylation at tyrosine 690 under the
incorrect numbering system.
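
The computer alignment step can be illustrated with a minimal sketch. It assumes a simplified trypsin rule (cleavage after every K or R, ignoring the proline exception) and uses a short, made-up toy sequence rather than the 91 kD protein sequence; all function names are hypothetical. The second helper reproduces the numbering arithmetic stated above, in which omission of the 11 residues at positions 261 to 271 shifts residue 701 to 690.

```python
def tryptic_peptides(seq):
    """Yield (start, peptide) pairs with 1-based start positions.

    Simplifying assumption: cleave after every K or R.
    """
    start = 0
    for i, aa in enumerate(seq):
        if aa in "KR" or i == len(seq) - 1:
            yield start + 1, seq[start:i + 1]
            start = i + 1


def candidates(seq, cycle=4, residue="Y"):
    """Return (peptide_start, peptide, residue_position) for every tryptic
    peptide that carries `residue` at the given Edman cycle."""
    return [
        (start, pep, start + cycle - 1)
        for start, pep in tryptic_peptides(seq)
        if len(pep) >= cycle and pep[cycle - 1] == residue
    ]


def uncorrected_position(pos, omitted_start=261, omitted_end=271):
    """Map a corrected residue number to the earlier numbering that lacked
    the omitted segment."""
    if pos < omitted_start:
        return pos
    if pos <= omitted_end:
        raise ValueError("residue lies inside the omitted segment")
    return pos - (omitted_end - omitted_start + 1)


# Toy sequence for illustration only (not the 91 kD protein).
TOY = "MSQWYELQQLDSKFLEQVHQLYRGTGYIKTELAYSVR"
```

With the toy sequence, only one peptide places tyrosine at the fourth cycle, which mirrors how a single candidate peptide was singled out in the text.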
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aqi m 707 was prepared. This 
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are transited to produce ihe 91 kD pr ^ ^ kD 
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protein was also consmicled. 

introduced by permanent transfection into U3A cells, which 
20 These construe, were mir*^ bp ^ ^ ^ fc g< m 
type parental cell (2fTGH) or the C91 cell was phosphorylated on tyrosine when
treated with IFN-γ (FIGURE 5C). This experiment confirmed that residue 701 is
the sole site on the 91 kD protein that is phosphorylated on tyrosine in response to IFN-γ.

An experiment was performed to determine whether the 84 kD protein was
phosphorylated on the same site as the 91 kD protein. C84 cells were labeled with
32P and treated with IFN-γ; the 32P-labeled 84 kD protein was immunoprecipitated
and cleaved with trypsin. The resulting tryptic phosphopeptides were analyzed by
2D phosphopeptide mapping (FIGURE 5B). A major spot was identified that
migrated similarly to peptide X from the 91 kD protein (FIGURE 4A). When
mixed, the two peptides migrated identically. Thus, it was concluded that the 84
kD protein is also tyrosine phosphorylated on Tyr701 in response to IFN-γ.



The function of the 91 kD protein and the 84 kD protein and the Tyr701 -> Phe701
mutant was tested in various steps in the signal transduction pathway that results in
IFN-γ-dependent gene activation. Removal of phosphate from the 91 kD
phosphoprotein by calf intestinal phosphatase or inhibition of in vivo
phosphorylation with staurosporine abolishes the 91 kD protein DNA binding
activity. The IFN-γ-dependent DNA protein complex, GAF, was detected in the
wild-type parental cells (2fTGH) and in C91 cells (FIGURE 6A). The C84 cells
also responded to IFN-γ, yielding a DNA-protein complex that migrated somewhat
faster, as would be expected for a smaller protein (FIGURE 6A). In contrast,
cells expressing the Tyr701 mutant (Cty) failed to produce an IFN-γ-dependent
DNA binding protein.


IFN-γ-induced translocation to the nucleus was also tested. Immunofluorescence
in C91 or C84 cells, detected throughout the cell before IFN-γ treatment, increased
in the nucleus after IFN-γ treatment (FIGURE 6). In contrast, the Tyr701 mutant
protein did not move to the nucleus in response to IFN-γ, suggesting that
phosphorylation on Tyr701 is required for the nuclear translocation of the 91 kD
protein (FIGURE 6).
U3A cells were transiently transfected with the 91 and 84 kD proteins, and the
Tyr701 mutant protein, and the transcriptional response to IFN-γ was measured in
these cells. A target gene was constructed containing luciferase as the reporter
and bearing one copy of the binding site for the 91 kD phosphoprotein upstream of
an RNA start site otherwise lacking promoter elements. Cells transfected with the
target gene and the wild-type 91 kD protein expression vector showed a 5- to 10-fold
stimulation of luciferase expression when treated with IFN-γ (FIGURE 7).
The IFN-γ-dependent transcriptional activation required the presence of the 91 kD
protein; IFN-γ did not enhance transcription in U3A cells transfected with the
reporter vector alone or a vector lacking the GAS site. Cells transfected with the
reporter vector and the Tyr701 mutant did not respond to IFN-γ, suggesting a
requirement for phosphorylation for gene activation. Protein immunoblot analysis
indicated that the 91 kD, 84 kD, and Tyr701 mutant proteins were expressed during
the transient transfection (FIGURE 7). Similar experiments done in human kidney
293 cells support the same conclusion. The results with transient transfections are
in accord with findings that in U3A cells accumulation of mRNA from endogenous
cellular genes in response to IFN-γ requires the 91 kD protein (30). In those
experiments, also, the 84 kD protein failed to direct the IFN-γ response.






EXAMPLE 5: THE ARG602 RESIDUE IN THE 91 kD SH2 DOMAIN IS
REQUIRED FOR TYR701 PHOSPHORYLATION



The 91 kD protein has a sequence from Try- to Pro- that resembles SH2
domains (38), amino acid regions known to bind tightly to tyrosine phosphates (39).
Since ligand activated kinases often present a phosphotyrosine to a substrate, we
tested the requirement for the SH2 domain in the 91 kD protein in ligand-mediated
phosphorylation. An Arg residue in the v-src SH2 domain is crucial for direct
interaction with a phosphotyrosine residue (40, 41),
and Arg602 of the 91 kD protein is in a comparable position within the SH2 homology
(38). We therefore changed the 91 kD protein cDNA to encode Leu602 instead of
Arg602 and inserted the new sequence into an expression vector. U3A cells, an
IFN-α and IFN-γ unresponsive cell line (29) which lacks the mRNA for the 91 kD
and 84 kD proteins (30), were transfected with expression vectors. Two
stable cell lines were selected that express the Arg602 -> Leu mutant protein. The
mutant protein immunoprecipitated from these cell lines was not phosphorylated
on tyrosine in response to IFN-γ (Figure 8b); thus a functional SH2 domain is
required for the tyrosine phosphorylation of the 91 kD protein, suggesting that the
kinase to which the substrate binds might in its active state have a tyrosine
phosphate.






DISCUSSION 



As mentioned earlier, the observation and conclusion underlying the present
invention were crystallized from a consideration of the results of certain
investigations with particular stimuli. Particularly, the present disclosure is
illustrated by the results of work on protein factors that govern transcriptional
control of IFNα-stimulated genes, as well as more recent data on the regulation of
transcription of genes stimulated by IFNγ.

For example, the above represents evidence that the 91 kD protein is the tyrosine
kinase target when IFNγ is the ligand. Thus two different ligands acting through
two different receptors both use these family members. With only a modest
number of family members and combinatorial use in response to different ligands,
this family of proteins becomes an even more likely possibility to represent a
general link between ligand-occupied receptors and transcriptional control of
specific genes in the nucleus.

It is proposed that other members of the 113-91 protein family will be identified as
phosphorylation targets in response to other ligands. If, as is believed, the tyrosine
phosphorylation site on proteins in this family is conserved, one can then easily
determine which family members are activated (phosphorylated), and likewise the
particular extracellular polypeptide ligand to which that family member is
responding. The modifications of these proteins (phosphorylation and
dephosphorylation) enable the preparation and use of assays for determining the
effectiveness of pharmaceuticals in potentiating or preventing intracellular
responses to various polypeptides, and such assays are accordingly contemplated
within the scope of the present invention.

Earlier work has concluded that a DNA binding protein was activated in the cell
cytoplasm in response to IFN-γ treatment and that this protein stimulated
transcription of the GBP gene (10,14). In the present work, with the aid of
antisera to proteins originally studied in connection with IFN-α gene stimulation
(7,12,15), the 91 kD ISGF-3 protein has been assigned a prominent role in IFN-γ
gene stimulation as well. The evidence for this conclusion included: 1) Antisera
specific to the 91 kD protein affected the IFN-γ dependent gel-shift complex.
2) A 91 kD protein could be cross-linked to the GAS IFN-γ activated site. 3) A
35S-labeled 91 kD protein and a 91 kD immunoreactive protein specifically purified
with the gel-shift complex. 4) The 91 kD protein is an IFN-γ dependent tyrosine
kinase substrate as indeed it had earlier proved to be in response to IFN-α (15).
5) The 91 kD protein but not the 113 kD protein moved to the nucleus in response
to IFN-γ treatment. None of these experiments prove, but they do strongly suggest,
that the same 91 kD protein acts differently in different DNA binding complexes
that are triggered by either IFN-α or IFN-γ.

These results strongly support the hypothesis originated from studies on IFN-α that
polypeptide cell surface receptors report their occupation by extracellular ligand to
latent cytoplasmic proteins that after activation move to the nucleus to trigger
transcription (4,15,21). Furthermore, because cytoplasmic phosphorylation and
factor activation is so rapid it appears likely that the functional receptor complexes
contain tyrosine kinase activity. Since the IFN-γ receptor chain that has been
cloned thus far (22) has no hint of possessing intrinsic kinase activity, perhaps
some other molecule with tyrosine kinase activity couples with the IFN-γ receptor.
Two recent results with other receptors suggest possible parallels to the situation
with the IFN receptors. The trk protein, which has an intracellular tyrosine kinase
domain, associates with the NGF receptor when that receptor is occupied (23). In
addition, the lck protein, a member of the src family of tyrosine kinases, is
co-precipitated with the T cell receptor (24). It is possible to predict that signal
transduction to the nucleus through these two receptors could involve latent
cytoplasmic substrates that form part of activated transcription factors. In any
event, it seems possible that there are kinases like trk or lck associated with the
IFN-γ receptor or with the IFN-α receptor.

With regard to the effect of phosphorylation on the 91 kD protein, it was
something of a surprise that after IFN-γ treatment the 91 kD protein becomes a
DNA binding protein. Its role must be different in response to IFN-α treatment.
There it is also phosphorylated on tyrosine and joins a complex with the 113 and
84 kD proteins but, as judged by UV cross-linking studies (7), the 91 kD protein
does not contact DNA.

In addition to becoming a DNA binding protein, it is clear that the 91 kD protein is
specifically translocated to the nucleus in the wake of IFN-γ stimulation. While the
present work strongly implicates the 91 kD protein as important in the immediate
IFN-γ transcriptional response of the GBP gene, two points should also be clear.
First, it is not known whether the 91 kD protein acts on its own to activate
transcription. Second, it is not known how widely used the 91 kD protein is in
the immediate IFN-γ transcriptional response. Only a few genes have been
studied that are activated immediately by IFN-γ without new protein synthesis. It
is at present uncertain whether activation of these genes operates through the 91
kD binding site.

The present examples demonstrate that phosphorylation of Tyr701 on the 91 kD
protein induces nuclear translocation and DNA binding of the protein.
Presumably, the phosphorylated 91 kD protein directly or indirectly activates
transcription in response to IFN-γ. This function of the phospho-91 kD protein
has been indirectly confirmed by the inability of a non-phosphorylated mutant 91
kD protein to induce transcription.

It was found that endogenous genes normally induced by IFN-γ cannot be induced
in U3A cells complemented with the Tyr701 -> Phe701 mutant protein. However, U3A
cells respond to IFN-α when transfected with either the 91 or 84 kD proteins.
Thus the 84 kD protein can fulfill the required role in the multimeric ISGF-3
complex induced by IFN-α in which either the 84 or 91 kD protein joins with the
113 kD protein and a 48 kD DNA binding protein (30). Cells reconstituted with
the Tyr701 -> Phe701 mutant protein cannot form ISGF-3 nor do IFN-α-induced mRNAs
accumulate in such cells.

After IFN-γ treatment, the 84 kD protein acts in parallel with the 91 kD protein
up to the point of gene activation: the 84 kD protein can be phosphorylated and
translocated and binds to DNA. However, only the 91 kD protein acts by itself as
a direct DNA binding protein capable of transcriptional activation. These results
suggest that the 38 COOH-terminal amino acids of the 91 kD protein are essential for
activation of transcription through a GAS site. It is possible that the 84 kD
protein functions to regulate activity of the 91 kD protein.






EXAMPLE 6: DIMERIZATION OF PHOSPHORYLATED STAT91



Stat91 (a 91 kD protein that acts as a signal transducer and activator of
transcription) is inactive in the cytoplasm of untreated cells but is activated by
phosphorylation on tyrosine in response to a number of polypeptide ligands
including IFN-α and IFN-γ. This example reports that inactive Stat91 in the
cytoplasm of untreated cells is a monomer and upon IFN-γ induced
phosphorylation it forms a stable homodimer. The dimer is capable of binding to
a specific DNA sequence directing transcription. Dissociation and reassociation
assays show that dimerization of Stat91 is mediated through SH2-phosphotyrosyl
peptide interactions. Dimerization involving SH2 recognition of specific
production of a long form of Stat91 with a C-terminal tag
encoded by the pMNC vector.

GST fusion protein expression plasmids were constructed by using the pGEX-2T
vector (Pharmacia). GST-91SH2 encodes amino acids 573 to 672 of Stat91;
GST-91mSH2 encodes amino acids 573 to 672 of Stat91 with an Arg602 -> Leu602
mutation; and GST-91SH3 encodes amino acids 506 to 564 [...].

[...] method, and [...] were selected in Dulbecco's modified Eagle's
medium containing G418 (0.5 mg/ml, Gibco), as described (45).
Preparation of Cell Extracts. Crude whole cell extracts were prepared as
described (31). Cytoplasmic and nuclear extracts were prepared essentially as
described (46).

Affinity Purification. Affinity purification with a biotinylated oligonucleotide was
described (31). The sequence of the biotinylated GAS oligonucleotide was from
the Ly6E gene promoter (34).

A molecular weight marker kit with a range of molecular weights from 14 to 545 kD
was obtained from Sigma. Determining molecular weights using nondenaturing
polyacrylamide gel was carried out following the manufacturer's procedure, which
is a modification of the methods of Bryan and Davis (47, 48). Phosphorylated and
unphosphorylated Stat91 samples obtained from affinity purification using a
biotinylated GAS oligonucleotide (31) were suspended in a buffer containing 10
mM Tris (pH 6.7), 16% glycerol, 0.04% bromphenol blue (BPB). The mixtures
were analyzed on 4.5%, 5.5%, 6.5%, and 7.5% native gels side by side with
standard markers using a Bio-Rad mini-Protean II Cell electrophoresis system.
Electrophoresis was stopped when the dye (BPB) reached the bottom of the gels.
The molecular size markers were revealed by Coomassie blue staining.
Phosphorylated and unphosphorylated Stat91 samples were detected by
immunoblotting with anti-91T.

Glycerol Gradient Analysis. Cell extracts (Bud 8) were mixed with protein
standards (Pharmacia) and subjected to centrifugation through preformed 10%-40%
glycerol gradients for 40 hours at 40,000 rpm in an SW41 rotor as described
(6).

Gel Mobility Shift Assays. Gel mobility shift assays were carried out as described
(34). An oligonucleotide corresponding to the GAS element from the human
FcγRI receptor gene (Pearse et al. 1993) was synthesized and used for gel
mobility shift assays. The oligonucleotide has the following sequence:
5'GATCGAGATGTATTTCCCAGAAAAG3' (SEQ ID NO: 14).
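
As a quick check on the probe sequence, the following sketch computes its reverse complement and scans both strands for a TTCNNNGAA-type element. The motif pattern is an assumption introduced here for illustration (the text above does not define a GAS consensus), and the function names are hypothetical.

```python
import re

# Probe sequence exactly as printed above (SEQ ID NO: 14).
GAS_OLIGO = "GATCGAGATGTATTTCCCAGAAAAG"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_motif(seq, pattern="TTC[ACGT]{3}GAA"):
    """Return (1-based start, matched text) for each non-overlapping match.

    Assumption: the default pattern is an illustrative TTCNNNGAA element,
    not a consensus taken from the text.
    """
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, seq)]
```

Because a TTCNNNGAA element has complementary ends, a match on one strand implies a match on the other, which the sketch makes easy to confirm.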

Synthesis of Peptides. Solid phase peptide synthesis was carried out with either a
DuPont RAMPS multiple synthesizer or by manual synthesis. C-terminal amino acids
attached to Wang resin were obtained from DuPont/NEN. All amino acids were
coupled as the N-Fmoc pentafluorophenyl esters (Advanced Chemtech), except for
N-Fmoc, PO-dimethyl-L-phosphotyrosine (Bachem). Double couplings were used.
Cleavage from resin and deprotection used thioanisole/m-cresol/TFA/TMSBr at
4°C for 16 hr. Purification used C-18 column HPLC with 0.1% TFA/acetonitrile
gradients. Peptides were characterized by 1H and 31P NMR, and by Mass Spec,
and were greater than 95% pure.

Guanidinium Hydrochloride Treatment. Extracts were incubated with guanidinium
hydrochloride (final concentration was 0.4 to 0.6 M) for two min. at room
temperature and then diluted with gel shift buffer (final concentration of guanidinium
hydrochloride was 100 mM) and incubated at room temperature for 15 min.
32P-labeled GAS oligonucleotide probe was then added directly to the mixture followed
by gel mobility shift assay.
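
The dilution step implies a 4- to 6-fold dilution (0.4 to 0.6 M down to 100 mM). A minimal sketch of that arithmetic follows; the function names and the 10 µl sample volume used in the usage note are hypothetical, and concentrations are expressed in mM to keep the numbers exact.

```python
def dilution_factor(initial_mm, final_mm):
    """Fold dilution needed to go from the initial to the final concentration."""
    return initial_mm / final_mm


def buffer_to_add(sample_volume_ul, initial_mm, final_mm):
    """Volume of buffer (in microliters) to add to a sample so that the
    denaturant concentration falls from initial_mm to final_mm."""
    return sample_volume_ul * (dilution_factor(initial_mm, final_mm) - 1)
```

For example, under these assumptions a 10 µl extract at 600 mM would need `buffer_to_add(10, 600, 100)` microliters of gel shift buffer to reach 100 mM.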


Dissociation-reassociation Analysis. Extracts were incubated with various
concentrations of peptides or fusion proteins, and labeled GAS oligonucleotide
probe in gel shift buffer was then added to promote the formation of protein-DNA
complex followed by mobility shift analysis. This assay did not involve
guanidinium hydrochloride treatment.

Preparation of Fusion Proteins. Bacterially expressed GST fusion proteins were
purified using standard techniques, as described in Birge et al., 1992. Fusion
proteins were quantified by O.D. absorbance at 280 nm. Aliquots were frozen
at -70°C.
Results

Detection of Ligand-induced Dimer Formation of Stat91 in Solution. In untreated
cells, Stat91 is not phosphorylated on tyrosine. Treatment with IFN-γ leads
within minutes to tyrosine phosphorylation and activation of DNA binding
capacity. The phosphorylated form migrates more slowly during electrophoresis
under denaturing conditions, affording a simple assay for the phosphoprotein (31).

To determine the native molecular weights of the phosphorylated and
unphosphorylated forms of Stat91, we separated them by affinity purification using
a biotinylated deoxyoligonucleotide containing a GAS sequence (interferon gamma
activation site) (Figure 9A). The separation of phosphorylated Stat91 from the
unphosphorylated form was efficient, as almost all detectable phosphorylated form
could bind to the GAS site while unphosphorylated Stat91 remained unbound. To
determine the molecular weights of the purified phosphorylated Stat91 and
unphosphorylated Stat91, samples of each were then subjected to electrophoresis
through a set of nondenaturing gels containing various concentrations of
acrylamide followed by Western blot analysis (Figure 9B). Native protein size
markers (Sigma) were included in the analysis.






This technique was originally described by Bryan (48) and was recently used for
dimer analysis (49). The logic of the technique is that increasing gel
concentrations affect the migration of larger proteins more than smaller proteins,
and the analysis is not affected by modifications such as protein phosphorylation
(49).

A function of the relative mobilities (Rm) was plotted versus the concentration of
acrylamide for each sample to construct Ferguson plots (Figure 9C). The
logarithm of the retardation coefficient (calculated from Figure 9C) of each sample
was then plotted against the logarithm of the relevant molecular weight range

(F.gure 9D). By extrapolation of its retardation coefficient (Figure 9D). the native 
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15 



, , h , „ Sta,9, from untreated c* was esttmated » * 
molecular we.ght of Stat9> ^ ^ ^ twice as 

95 kD , * tyrostne Phosp moiKo , ar welgh , from 
, arge , or appro >,rna.e y ^ B ^ ^ ^ ^ SDA 

arniw) acid sequence of Stat91,sS/ ' and rcfs . 12 a™) 45), 

5 ge ,s with an apparent molecular weigh, of 9 » 
L concluded that in «•*»• unphosphorylated Sta,91 
w hile.yrosin=phosphoryla K dS«91isad,mer. 
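
The Ferguson-plot reasoning described above can be illustrated numerically. The following is a minimal sketch, not the procedure used to produce Figure 9: it assumes that the plotted function of Rm is log10(Rm), that the retardation coefficient is the negative slope of that function against gel concentration, and that the standards follow a straight line on the log-log plot. The function names and all numbers are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def retardation_coefficient(gel_percents, rel_mobilities):
    """Negative slope of log10(Rm) versus acrylamide concentration."""
    logs = [math.log10(rm) for rm in rel_mobilities]
    slope, _ = linear_fit(gel_percents, logs)
    return -slope

def estimate_mw(kr_unknown, standards):
    """Read a molecular weight off the log(Kr) versus log(MW) line.

    standards: list of (molecular_weight_kD, retardation_coefficient) pairs.
    """
    log_mw = [math.log10(mw) for mw, _ in standards]
    log_kr = [math.log10(kr) for _, kr in standards]
    slope, intercept = linear_fit(log_mw, log_kr)
    return 10 ** ((math.log10(kr_unknown) - intercept) / slope)

# Synthetic illustration only: every number below is invented, not taken from Figure 9.
gels = [5.0, 6.0, 7.0, 8.0]                      # percent acrylamide
standards = [(mw, 0.001 * mw ** 0.8) for mw in (66.0, 132.0, 272.0, 545.0)]
true_kr = 0.001 * 95.0 ** 0.8                    # a made-up 95 kD species
mobilities = [0.9 * 10 ** (-true_kr * t) for t in gels]
kr = retardation_coefficient(gels, mobilities)
print(round(estimate_mw(kr, standards), 1))
```

Because the synthetic mobilities are generated from the same relationship that the fit assumes, the sketch simply recovers the input mass; with real gel data the scatter of the standards would determine how reliable the estimate is.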

We also employed glycerol gradient analysis to [...]. Fractions from the gradient
were collected and [...] gel shift analysis (Figure 10A and 10B). [...] slow
migrating [...] Stat91 could be detected by immunoblotting (Figure 10A). [...]
(Figure 10C), supporting the conclusion that [...] monomer in solution while [...].

When fractions from the [...] mobility shift analysis (Figure 10B), the
phosphorylated form of [...] correlated well with the DNA-binding activity [...].
Thus, [...] phosphorylated dimeric Stat91 has the sequence-specific [...]
capacity.

Stat91 Binds DNA as a Dimer. [...] gel shift band [...] different species
provides evidence of dimerization of the DNA binding [...]. Since Stat91 requires
specific tyrosine phosphorylation in ligand-treated [...] in transfected cells.
An expression vector (MNC91L) encoding Stat91L, [...] terminal tag, was generated.
[The extra amino acids were encoded by [...] DNA sequence from plasmid pMNC (see
Materials and Methods).] A Stat84 expression vector (MNC84) was also available (45).
[...] experiments, mutant human cell lines (U3) are known that lack the Stat91/84
mRNA and proteins (29,30). The U3 cells were therefore separately transfected
with vectors encoding Stat84 (MNC84) or Stat91L (MNC91L) or a mixture of
both vectors. Permanent transfectants expressing Stat84 (C84), Stat91L (C91L) or
both proteins (Cmx) were isolated (Figure 11A).

(Figure 11B) [...] shift band than extracts of treated [...] cells. Most
importantly, extracts from IFN-γ-treated Cmx cells expressing both Stat84 and
Stat91L proteins formed an additional intermediate gel shift band. Anti-91, an
antiserum against the C-terminal 38 amino acids of Stat91 (12) that are absent in
Stat84, removed the top two shift bands seen with the Cmx extracts. Anti-91T, an
antiserum against amino acids 609 to 716 (15) that recognizes both Stat91L and
Stat84 proteins, inhibited the binding of all three shift bands. Thus, the middle
band formed by extracts of the Cmx cells is clearly identified as a heterodimer of
Stat84 and Stat91L. We concluded that both Stat91 and Stat84 bind DNA as
homodimers and, if present in the same cell, will form heterodimers.
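
The three-band argument can be made explicit by enumerating the dimers that two distinguishable subunits can form. This is only an illustrative sketch; the function name is hypothetical and nothing here models gel mobility itself.

```python
from itertools import combinations_with_replacement

def dimer_species(subunits):
    """List every unordered pairing of the given subunits, homodimers included."""
    return ["/".join(pair) for pair in combinations_with_replacement(subunits, 2)]

print(dimer_species(["Stat84", "Stat91L"]))
# ['Stat84/Stat84', 'Stat84/Stat91L', 'Stat91L/Stat91L']
```

Two subunits of different size give two homodimers and one heterodimer, so a band of intermediate mobility that reacts with reagents specific for both subunits is the expected signature of the mixed dimer.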

We next wanted to detect the formation of dimers in vitro. When cytoplasmic or
nuclear extracts of IFN-γ-treated C84 or C91L cells were mixed and analyzed
(Figure 12), only the fast or slow migrating gel shift bands were observed. Thus
[...] that once formed in vivo, the dimers were stable. To promote the [...]
either cytoplasmic or nuclear extracts of IFN-γ-treated [...] extracts were [...]
0.5 M [...] renaturation and subsequently used for gel [...] band did not form.
The intermediate band was again proven by [...] to consist of Stat84-Stat91L
dimer (data not shown).

This experiment defined conditions under which the dimer was stable, but also
[...] non-covalent interactions.

Dimerization of Stat91 Involves Phosphotyrosyl Peptide and SH2 Interactions.
[...] the subunits of the dimer should [...] with the dimerization domain(s) of
the protein [...]. [...] DNA to form the stable protein-DNA complex should lead to
the detection of heterodimers [...]. [...] of the dimer may not be able to
re-form [...] detected (Figure 13).
The Stat91 sequence contains an SH2 domain (amino acids 569 to 700, see
discussion below), and we knew that Tyr-701 was the single phosphorylated
tyrosine residue required for DNA binding activity (supra, 45). Furthermore, we
have observed that phosphotyrosine at 10 mM, but not phosphoserine or
phosphothreonine, could prevent the formation of the Stat91-DNA complex. We
therefore sought evidence that the dimerization of Stat91 involved a specific
SH2-phosphotyrosine interaction using the dissociation and reassociation assay.

To evaluate the role of the SH2-phosphotyrosine interaction in dimerization, two
peptide fragments of Stat91 corresponding to segments of the SH2 and
phosphotyrosine domains of Stat91 were prepared: a non-phosphorylated peptide
(91Y), LDGPKGTGYIKTELI (SEQ ID NO:15) (corresponding to amino acids
693-707), and a phosphotyrosyl peptide (91Y-p), GY*IKTE (SEQ ID
NO:16) (representing residues 700-705).

Activated Stat84 or Stat91L was obtained from IFN-γ-treated C84 or C91L cells
and mixed in the presence of various concentrations of the peptides, followed by
gel mobility shift analysis. The non-phosphorylated peptide had no effect on the
presence of the two gel shift bands characteristic of Stat84 or Stat91L homodimers
(Figure 14, lanes 2-4). In contrast, the phosphorylated peptide (91Y-p) at a
concentration of 4 µM clearly promoted the exchange between the subunits of
Stat84 dimers and Stat91L dimers to form heterodimers (Figure 14, lane 5). At a
higher concentration (160 µM), peptide 91Y-p, but not the unphosphorylated
peptide, dissociated the dimers and blocked the formation of DNA-protein
complexes (Figure 14, lane 7).

When cells are treated with IFN-α, both Stat91 (or 84) and Stat113 become
phosphorylated (15). Antiserum to Stat113 can precipitate both Stat113 and Stat91
after IFN-α treatment but not before, suggesting an IFN-α-dependent interaction of
these two proteins, perhaps as a heterodimer (15).
In Stat113, Tyr-690, in the homologous position to Tyr-701 in Stat91, is the single
target residue for phosphorylation. Amino acids downstream of the affected
tyrosine residue show some homology between the two proteins. We therefore
prepared a phosphotyrosyl peptide of Stat113 (113Y-p), KVNLQERRKY*LKHR
(SEQ ID NO:17) [amino acids 681 to 694; (38)]. At concentrations similar to
91Y-p, 113Y-p also promoted the exchange of subunits between Stat84 and
Stat91L, while at a high concentration (40 µM), 113Y-p prevented the gel shift
bands almost completely (Figure 14, lanes 8-10).
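
As a consistency check on the residue numbering quoted for the three peptides, the following sketch maps each peptide onto the stated residue range of its parent protein and reports where the tyrosine falls. It assumes the final character of 91Y is I (isoleucine) and omits the asterisk that marks the phosphorylated tyrosine; the helper names are hypothetical.

```python
def residue_numbers(sequence, first_residue):
    """Pair each residue of a peptide with its position in the parent protein."""
    return [(first_residue + i, aa) for i, aa in enumerate(sequence)]

def tyrosine_positions(sequence, first_residue):
    """Return the parent-protein positions of every tyrosine in the peptide."""
    return [pos for pos, aa in residue_numbers(sequence, first_residue) if aa == "Y"]

peptides = {
    "91Y":    ("LDGPKGTGYIKTELI", 693),  # Stat91, stated range 693-707
    "91Y-p":  ("GYIKTE", 700),           # Stat91, stated range 700-705
    "113Y-p": ("KVNLQERRKYLKHR", 681),   # Stat113, stated range 681-694
}

for name, (seq, start) in peptides.items():
    last = start + len(seq) - 1
    print(name, last, tyrosine_positions(seq, start))
```

Under these assumptions each peptide ends at the stated last residue, the single tyrosine of both Stat91 peptides lands on position 701, and the tyrosine of the Stat113 peptide lands on position 690, in agreement with the text.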

We prepared a phosphotyrosyl peptide (SrcY-p), EPQY*EEIPIYL (SEQ ID
NO:18), which is known to interact with the Src SH2 domain with a high affinity
(50). This peptide showed no effect on Stat91 dimer formation (Figure 14,
lanes 11-13). Thus, it seems that Stat91 dimerization involves SH2 interaction with
tyrosine residues in a specific peptide sequence.

To test further the specificity of Stat91 dimerization mediated through the specific
phosphotyrosyl peptide-SH2 interaction, a fusion product of glutathione-S-
transferase with the Stat91-SH2 domain (GST-91SH2) was prepared (Figure 15A)
and used in the in vitro dissociation-reassociation assay. At concentrations of 0.5
to 5 µM, the Stat91-SH2 domain promoted the formation of a heterodimer (Figure
15B, lanes 5-7). In contrast, neither GST alone, nor fusion products with a
mutant (Arg→Leu) Stat91-SH2 domain (GST-91mSH2) that renders Stat91 non-
functional in vivo, a Stat91 SH3 domain (GST-91SH3), nor the Src SH2 domain
(GST-SrcSH2), induced the exchange of subunits between the Stat84 and Stat91L
homodimers (Figure 15B).



Discussion 



The initial sequence analysis of the Stat91 and Stat113 proteins revealed the
presence of SH2-like domains (see 13,38). Further, it was found that STAT
proteins themselves are phosphorylated on single tyrosine residues during their
[...] conserved "pocket" region of the SH2 [...] it seemed [...] of the JAK
kinases.

Since the activated STATs have phosphotyrosine residues and SH2 domains, a
second suggested role for SH2 domains [...] protein-protein interactions within
[...] dimerization on gradient [...] gels [...] peptides from Stat91 or Stat113
and the SH2 domain of [...] efficiently promote the formation of heterodimers
between Stat91L and [...] assay, we conclude that dimerization of Stat91 [...]
SH2-phosphotyrosyl peptide interactions.

The possibility of an SH2 domain in Stat91 was indicated initially by the presence
of highly conserved amino acid stretches [...] in the 569 to 700 residue region,
several [...] in the amino terminal end of the region, [...] of SH2 domains. The
[...] was also true for the STAT proteins [...] several proteins (Figure 16) is
based on these.

The characteristic W [...] is preceded by hydrophilic residues and is followed
by hydrophobic [...] is shifted in Stat91. The three [...]
the portions indicated as alphaA2, betaB5, and betaD5. Figure 16 shows an
alignment which accomplishes this by insertions in the 'AA' and 'CD' regions.
This is a different alignment from that previously suggested (38), and gives a
satisfactory alignment in the (beta)D region, although, like the previous alignment,
it is obviously considerably less similar to the other SH2's in the C-terminus.

This alignment suggests that the SH2 domain in Stat91 would end in the
vicinity of residue 700. In such an alignment, Tyr-701 occurs almost
immediately after the SH2 domain: a distance too short to allow an intramolecular
phosphotyrosine-SH2 interaction. Since the data presented earlier strongly
implicate an SH2-phosphotyrosine interaction in dimerization, such
an interaction is likely to be between two phospho-Stat91 subunits as a reciprocal
pTyr-SH2 interaction.

The apparent stability of the Stat91 dimer may be due to a high association rate
coupled with a high dissociation rate of SH2-phosphotyrosyl peptide interactions, as
suggested (Felder et al., 1993, Mol. Cell Biol. 13:1449-1455), coupled with
interactions between other domains of Stat91 that may contribute stability to the
Stat91 dimer. Interference by homologous phosphopeptides with the SH2-
phosphotyrosine interaction would then lower stability sufficiently to allow
complete dissociation and heterodimerization.

Dimer formation by phospho-Stat91 is the first case in eukaryotes where
dimer formation is regulated by phosphorylation, and the only one thus far
dependent on tyrosine phosphorylation. We anticipate that dimerization within the
STAT protein family will be important. It seems likely that in cells treated with
IFN-α there is Stat113-Stat91 interaction (15). This may well be mediated
through SH2 and phosphotyrosyl peptide interactions as described above, leading
to a complex (a probable dimer of Stat91-Stat113) which joins with a 48 kD DNA
binding protein (a member of another family of DNA binding factors) to make a
complex capable of binding to a different DNA site. Furthermore, two mouse
(i) APPLICANT: The Rockefeller University

(ii) TITLE OF INVENTION: RECEPTOR RECOGNITION FACTORS, PROTEIN 
SEQUENCES AND METHODS OF USE THEREOF 

(iii) NUMBER OF SEQUENCES: 22 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Klauber & Jackson 

(B) STREET: 411 Hackensack Avenue 

(C) CITY: Hackensack 

(D) STATE: New Jersey 

(E) COUNTRY: USA 

(F) ZIP : 07601 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS /MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: WO not yet assigned 

(B) FILING DATE: 26-SEP-1994 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/212,185

(B) FILING DATE: 11-MAR-1994

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/212,184

(B) FILING DATE: 11-MAR-1994

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/126,588

(B) FILING DATE: 24-SEP-1993

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/126,595

(B) FILING DATE: 24-SEP-1993

(viii) ATTORNEY / AGENT INFORMATION: 

(A) NAME: Jackson Esq., David A. 

(B) REGISTRATION NUMBER: 26,742 

(C) REFERENCE/DOCKET NUMBER: 600-1-073 PCT 

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 201 487-5800

(B) TELEFAX: 201 343-1684

(C) TELEX: 133521

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3268 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 



(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: HeLa 

(ix) FEATURE: 

(A) NAME /KEY: CDS 

(B) LOCATION: 25..2577
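
The sequence description for SEQ ID NO:1 pairs each codon of the coding region with a three-letter amino acid code. As a hedged illustration of how that pairing can be checked, the sketch below translates the first nine codons of the listing with the standard genetic code. The table construction and function name are illustrative and are not part of the listing.

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second and third base in T, C, A, G order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate a DNA string codon by codon, ignoring spaces and any trailing partial codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# First nine codons of the coding region as printed in the listing.
first_codons = "ATG GCG CAG TGG GAA ATG CTG CAG AAT"
print(translate(first_codons))  # MAQWEMLQN
```

In one-letter code, MAQWEMLQN corresponds to Met Ala Gln Trp Glu Met Leu Gln Asn, the first nine residues given in the listing. The same check can be applied to any other row in which the printed nucleotides are legible.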



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ACTGCAACCC TAATCAGAGC CCAA ATG GCG CAG TGG GAA ATG CTG CAG AAT     51
                           Met Ala Gln Trp Glu Met Leu Gln Asn
                           1               5

CTT GAC AGC CCC TTT CAG GAT CAG CTG CAC CAG CTT TAC TCG CAC AGC    99
Leu Asp Ser Pro Phe Gln Asp Gln Leu His Gln Leu Tyr Ser His Ser
10                  15                  20                  25

CTC CTG CCT GTG GAC ATT CGA CAG TAC TTG GCT GTC TGG ATT GAA GAC   147
Leu Leu Pro Val Asp Ile Arg Gln Tyr Leu Ala Val Trp Ile Glu Asp
                30                  35                  40

CAG AAC TGG CAG GAA GCT GCA CTT GGG ACT GAT GAT TCC AAG GCT ACC   195
Gln Asn Trp Gln Glu Ala Ala Leu Gly Ser Asp Asp Ser Lys Ala Thr
            45                  50                  55

ATG CTA TTC TTC CAC TTC TTG GAT CAG CTG AAC TAT GAG TGT GGC CGT   243
Met Leu Phe Phe His Phe Leu Asp Gln Leu Asn Tyr Glu Cys Gly Arg
        60                  65                  70

... ACC CAG GAC CCA GAG TCC TTG TTG CTG CAG CAC AAT TTG CGG AAA   291
Cys Ser Gln Asp Pro Glu Ser Leu Leu Leu Gln His Asn Leu Arg Lys
    75                  80                  85

TTC TGC CGG GAC ATT CAG CCC TTT TCC CAG GAT CCT ACC CAG TTG GCT   339
Phe Cys Arg Asp Ile Gln Pro Phe Ser Gln Asp Pro Thr Gln Leu Ala
90                  95                  100                 105

GAG ATG ATC TTT AAC CTC CTT CTG GAA GAA AAA AGA ATT TTG ATC CAG   387
Glu Met Ile Phe Asn Leu Leu Leu Glu Glu Lys Arg Ile Leu Ile Gln
                110                 115                 120

GCT CAG AGG GCC CAA TTG GAA CAA GGA GAG CCA GTT CTC GAA ACA CCT   435
Ala Gln Arg Ala Gln Leu Glu Gln Gly Glu Pro Val Leu Glu Thr Pro
            125                 130                 135

GTG GAG AGC CAG CAA CAT GAG ATT GAA TCC CGG ATC CTG GAT TTA AGG   483
Val Glu Ser Gln Gln His Glu Ile Glu Ser Arg Ile Leu Asp Leu Arg
        140                 145                 150

GCT ATG ATG GAG AAG CTG GTA AAA TCC ATC AGC CAA CTG AAA GAC CAG   531
Ala Met Met Glu Lys Leu Val Lys Ser Ile Ser Gln Leu Lys Asp Gln
    155                 160                 165

CAG GAT GTC TTC TGC TTC CGA TAT AAG ATC CAG GCC AAA GGG AAG ACA   579
Gln Asp Val Phe Cys Phe Arg Tyr Lys Ile Gln Ala Lys Gly Lys Thr
170                 175                 180                 185
CCC TCT CTG GAC CCC CAT CAG ACC AAA GAG CAG AAG ATT CTG CAG GAA   627
Pro Ser Leu Asp Pro His Gln Thr Lys Glu Gln Lys Ile Leu Gln Glu
                190                 195                 200

ACT CTC AAT GAA CTG GAC AAA AGG AGA AAG GAG GTG CTG GAT GCC TCC   675
Thr Leu Asn Glu Leu Asp Lys Arg Arg Lys Glu Val Leu Asp Ala Ser
            205                 210                 215

AAA GCA CTG CTA GGC CGA TTA ACT ACC CTA ATC GAG CTA CTG CTG CCA   723
Lys Ala Leu Leu Gly Arg Leu Thr Thr Leu Ile Glu Leu Leu Leu Pro
        220                 225                 230

AAG TTG GAG GAG TGG AAG GCC CAG CAG CAA AAA GCC TGC ATC AGA GCT   771
Lys Leu Glu Glu Trp Lys Ala Gln Gln Gln Lys Ala Cys Ile Arg Ala
    235                 240                 245

CCC ATT GAC CAC GGG TTG GAA CAG CTG GAG ACA TGG TTC ACA GCT GGA   819
Pro Ile Asp His Gly Leu Glu Gln Leu Glu Thr Trp Phe Thr Ala Gly
250                 255                 260                 265

GCA AAG CTG TTG TTT CAC CTG AGG CAG CTG CTG AAG GAG CTG AAG GGA   867
Ala Lys Leu Leu Phe His Leu Arg Gln Leu Leu Lys Glu Leu Lys Gly
                270                 275                 280

CTG AGT TGC CTG GTT AGC TAT CAG GAT GAC CCT CTG ACC AAA GGG GTG   915
Leu Ser Cys Leu Val Ser Tyr Gln Asp Asp Pro Leu Thr Lys Gly Val
            285                 290                 295

GAC CTA CGC AAC GCC CAG GTC ACA GAG TTG CTA CAG CGT CTG CTC CAC   963
Asp Leu Arg Asn Ala Gln Val Thr Glu Leu Leu Gln Arg Leu Leu His
        300                 305                 310

AGA GCC TTT GTG GTA GAA ACC CAG CCC TGC ATG CCC CAA ACT CCC CAT  1011
Arg Ala Phe Val Val Glu Thr Gln Pro Cys Met Pro Gln Thr Pro His
    315                 320                 325

CGA CCC CTC ATC CTC AAG ACT GGC AGC AAG TTC ACC GTC CGA ACA AGG  1059
Arg Pro Leu Ile Leu Lys Thr Gly Ser Lys Phe Thr Val Arg Thr Arg
330                 335                 340                 345

CTG CTG GTG AGA CTC CAG GAA GGC AAT GAG TCA CTG ACT GTG GAA GTC  1107
Leu Leu Val Arg Leu Gln Glu Gly Asn Glu Ser Leu Thr Val Glu Val
                350                 355                 360

TCC ATT GAC AGG AAT CCT CCT CAA TTA CAA GGC TTC CGG AAG TTC AAC  1155
Ser Ile Asp Arg Asn Pro Pro Gln Leu Gln Gly Phe Arg Lys Phe Asn
            365                 370                 375

ATT CTG ACT TCA AAC CAG AAA ACT TTG ACC CCC GAG AAG GGG CAG AGT  1203
Ile Leu Thr Ser Asn Gln Lys Thr Leu Thr Pro Glu Lys Gly Gln Ser
        380                 385                 390

CAG GGT TTG ATT TGG GAC TTT GGT TAC CTG ACT CTG GTG GAG CAA CGT  1251
Gln Gly Leu Ile Trp Asp Phe Gly Tyr Leu Thr Leu Val Glu Gln Arg
    395                 400                 405

TCA GGT GGT TCA GGA AAG GGC AGC AAT AAG GGG CCA CTA GGT GTG ACA  1299
Ser Gly Gly Ser Gly Lys Gly Ser Asn Lys Gly Pro Leu Gly Val Thr
410                 415                 420                 425

GAG GAA CTG CAC ATC ATC AGC TTC ACG GTC AAA TAT ACC TAC CAG GGT  1347
Glu Glu Leu His Ile Ile Ser Phe Thr Val Lys Tyr Thr Tyr Gln Gly
                430                 435                 440

CTG AAG CAG GAG CTG AAA ACG GAC ACC CTC CCT GTG GTG ATT ATT TCC  1395
Leu Lys Gln Glu Leu Lys Thr Asp Thr Leu Pro Val Val Ile Ile Ser
            445                 450                 455
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JK S £S SS SS K SI | - - - - il ™ s - 
S I S S 2S S S K S SS S | E SS S S 

AAG GCC CCC TGG A=G JTG CTG GGC OCT OCT CTC ACT T=G CAG TTC TCC 

Lys Ala Pro Trp Ser Leu Leu Gly Pro Aia ^ f 5q5 

TCC TAT GTT GGC CCA GGC CTC AAC TCA GAC CAG CTG AGC ATG CTG AGA 
Ser Tyr Val Gly Arg Gly Leu Asn Ser Asp bin ueu ^ 
510 bli3 
AAG CTG TTC GGG CAG AAC TGT AGG ACT GAG GAT CCA TTA TTG TCC 
Asn Lys Leu Phe Gly Gin Asn Cys Arg Thr Glu Asp fro 
525 530 

S S S S S 5! S S5 K - S S g S S S 
S S S S S K S S SK S S2 SS SS ^ S K 
S 5 S S S i 5 S E Sg S S S S £ i 

GAG CGC CGG CTG CTG AAG AAG ACC ATG TCT GGC ACC TTT CTA CTG CGC 
Glu Arg Arg Leu Leu Lys Lys Thr Met Ser Gly Thr Phe be 
590 595 

£ SS s s sj s? 1 - s ss i ss ss 

605 610 
„»m ATvr rTP PTC ATC TAC TCT GTG CAA CCG TAC ACG AAG 

S Z ?2 S Tyr Ser Val Gin Pro Tyr Thr Lys 

620 625 

SK vS SS SS E SS SS SS S SS SS SS SS SS SS SS 
is SS £ SS ss s si SS SS SS S° ss SS SS SS f 

ss ss ss ss ss i ss ss s s s ss £s ss ss ss k 

i£ ss SS SS SS iSS % ss ss ss ss £ SS SS SS 

ss s is ss ss ss s ss ss ss ss ss ss ss ss is 
ss ss ss ss ss ss s ss ss ss ss ss ss ss ss ss 



[Nucleotide position markers displaced from the sequence rows above: 1491, 1539, 1587, 1635, 1683, 1731, 1779, 1827, 1875, 1923, 1971, 2019, 2067, 2115, 2163, 2211]
GAG CCA GAG CTC AGC CTG GAC TTA GAG CCA CTG CTG AAG GCA GGC CTG  2259
Glu Pro Glu Leu Ser Leu Asp Leu Glu Pro Leu Leu Lys Ala Gly Leu
730                 735                 740                 745

GAT CTG GGG CCA GAG CTA GAG TCT GTG CTG GAG TCC ACT CTG GAG CCT  2307
Asp Leu Gly Pro Glu Leu Glu Ser Val Leu Glu Ser Thr Leu Glu Pro
                750                 755                 760

GTG ATA GAG CCC ACA CTA TGC ATG GTA TCA CAA ACA GTG CCA GAG CCA  2355
Val Ile Glu Pro Thr Leu Cys Met Val Ser Gln Thr Val Pro Glu Pro
            765                 770                 775

GAC CAA GGA CCT GTA TCA CAG CCA GTG CCA GAG CCA GAT TTG CCC TGT  2403
Asp Gln Gly Pro Val Ser Gln Pro Val Pro Glu Pro Asp Leu Pro Cys
        780                 785                 790

GAT CTG AGA CAT TTG AAC ACT GAG CCA ATG GAA ATC TTC AGA AAC TGT  2451
Asp Leu Arg His Leu Asn Thr Glu Pro Met Glu Ile Phe Arg Asn Cys
    795                 800                 805

GTA AAG ATT GAA GAA ATC ATG CCG AAT GGT GAC CCA CTG TTG GCT GGC  2499
Val Lys Ile Glu Glu Ile Met Pro Asn Gly Asp Pro Leu Leu Ala Gly
810                 815                 820                 825

CAG AAC ACC GTG GAT GAG GTT TAC GTC TCC CGC CCC AGC CAC TTC TAC  2547
Gln Asn Thr Val Asp Glu Val Tyr Val Ser Arg Pro Ser His Phe Tyr
                830                 835                 840

ACT GAT GGA CCC TTG ATG CCT TCT GAC TTC TAGGAACCAC ATTTCCTCTG    2597
Thr Asp Gly Pro Leu Met Pro Ser Asp Phe
            845                 850



TTCTTTTCAT ATCTCTTTGC CCTTCCTACT CCTCATAGCA TGATATTGTT CTCCAAGGAT  2657

GGGAATCAGG CATGTGTCCC TTCCAAGCTG TGTTAACTGT TCAAACTCAG GCCTGTGTGA  2717

CTCCATTGGG GTGAGAGGTG AAAGCATAAC ATGGGTACAG AGGGGACAAC AATGAATCAG  2777

AACAGATGCT GAGCCATAGG TCTAAATAGG ATCCTGGAGG CTGCCTGCTG TGCTGGGAGG  2837

TATAGGGGTC CTGGGGGCAG GCChGGGCAG TTGACAGGTA CTTGGAGGGC TCAGGGCAGT  2897

GGCTTCTTTC CAGTATGGAA GGATTTCAAC ATTTTAATAG TTGGTTAGGC TAAACTGGTG  2957

CATACTGGCA TTGGCCTTGG TGGGGAGCAC AGACACAGGA TAGGACTCCA TTTCTTTCTT  3017

CCATTCCTTC ATGTCTAGGA TAACTTGCTT TCTTCTTTCC TTTACTCCTG GCTCAAGCCC  3077

TGAATTTCTT CTTTTCCTGC AGGGGTTGAG AGCTTTCTGC CTTAGCCTAC CATGTGAAAC  3137

TCTACCCTGA AGAAAGGGAT GGATAGGAAG [...]                             3197

CTACTCTGCC CCCTAAGCTG GCTGTACCTG TTCCTCCCCC ATAAAATGAT CCTGCCAATC  3257

TAAAAAAAAA A                                                       3268



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 851 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Ala Gln Trp Glu Met Leu Gln Asn Leu Asp Ser Pro Phe Gln Asp
1               5                   10                  15

Gln Leu His Gln Leu Tyr Ser His Ser Leu Leu Pro Val Asp Ile Arg
            20                  25                  30

Gln Tyr Leu Ala Val Trp Ile Glu Asp Gln Asn Trp Gln Glu Ala Ala
        35                  40                  45

Leu Gly Ser Asp Asp Ser Lys Ala Thr Met Leu Phe Phe His Phe Leu
    50                  55                  60

Asp Gln Leu Asn Tyr Glu Cys Gly Arg Cys Ser Gln Asp Pro Glu Ser
65                  70                  75                  80

Leu Leu Leu Gln His Asn Leu Arg Lys Phe Cys Arg Asp Ile Gln Pro
                85                  90                  95

Phe Ser Gln Asp Pro Thr Gln Leu Ala Glu Met Ile Phe Asn Leu Leu
            100                 105                 110

Leu Glu Glu Lys Arg Ile Leu Ile Gln Ala Gln Arg Ala Gln Leu Glu
        115                 120                 125

Gln Gly Glu Pro Val Leu Glu Thr Pro Val Glu Ser Gln Gln His Glu
    130                 135                 140

Ile Glu Ser Arg Ile Leu Asp Leu Arg Ala Met Met Glu Lys Leu Val
145                 150                 155                 160

Lys Ser Ile Ser Gln Leu Lys Asp Gln Gln Asp Val Phe Cys Phe Arg
                165                 170                 175

Tyr Lys Ile Gln Ala Lys Gly Lys Thr Pro Ser Leu Asp Pro His Gln
            180                 185                 190

Thr Lys Glu Gln Lys Ile Leu Gln Glu Thr Leu Asn Glu Leu Asp Lys
        195                 200                 205

Arg Arg Lys Glu Val Leu Asp Ala Ser Lys Ala Leu Leu Gly Arg Leu
    210                 215                 220

Thr Thr Leu Ile Glu Leu Leu Leu Pro Lys Leu Glu Glu Trp Lys Ala
225                 230                 235                 240

Gln Gln Gln Lys Ala Cys Ile Arg Ala Pro Ile Asp His Gly Leu Glu
                245                 250                 255

Gln Leu Glu Thr Trp Phe Thr Ala Gly Ala Lys Leu Leu Phe His Leu
            260                 265                 270

Arg Gln Leu Leu Lys Glu Leu Lys Gly Leu Ser Cys Leu Val Ser Tyr
        275                 280                 285

Gln Asp Asp Pro Leu Thr Lys Gly Val Asp Leu Arg Asn Ala Gln Val
    290                 295                 300

Thr Glu Leu Leu Gln Arg Leu Leu His Arg Ala Phe Val Val Glu Thr
305                 310                 315                 320

Gln Pro Cys Met Pro Gln Thr Pro His Arg Pro Leu Ile Leu Lys Thr
                325                 330                 335

Gly Ser Lys Phe Thr Val Arg Thr Arg Leu Leu Val Arg Leu Gln Glu
            340                 345                 350

Gly Asn Glu Ser Leu Thr Val Glu Val Ser Ile Asp Arg Asn Pro Pro
        355                 360                 365

Gln Leu Gln Gly Phe Arg Lys Phe Asn Ile Leu Thr Ser Asn Gln Lys
    370                 375                 380

Thr Leu Thr Pro Glu Lys Gly Gln Ser Gln Gly Leu Ile Trp Asp Phe
385                 390                 395                 400

Gly Tyr Leu Thr Leu Val Glu Gln Arg Ser Gly Gly Ser Gly Lys Gly
                405                 410                 415

Ser Asn Lys Gly Pro Leu Gly Val Thr Glu Glu Leu His Ile Ile Ser
            420                 425                 430

Phe Thr Val Lys Tyr Thr Tyr Gln Gly Leu Lys Gln Glu Leu Lys Thr
        435                 440                 445

Asp * Leu Pro v.1 v.1 ne lie ser Asn Ma, As; Gin Leu Ser He 
Ala "p Ala Ser Val Leu Trp Phe As„ Leu Leu Ser Pro Asn Leu Gin 
I" Gin Gin Phe Phe sir Asn Pro Pro Lys »1. Pro Trp Ser Leu Leu 
Gly Pro Ala Leu S« Trp Gin Ph. Ser Ser Tyr V.! G!y Arg Gly Leu 

M „ ser Asp Gil Leu Ser »er Leu Ar g Asn Lys Leu Phe G ly Gin Asn 

515 

cys gg Thr Glu Asp Pro Leu Leu Ser Trp Ma Asp Phe Thr Lys Ar 9 

Glu Z Pro Pro Gly Lys Leu Pro Phe Trp Thr Trp Leu Asp Lys lie 

545 550 , R 

o c- ion t v c asd Leu Trp Asn Asp Gly Arg 
Leu Glu Leu Val His Asp His Leu Lys Asp ue y ^ 

565 

U. „e t Gly Phe v=l ser ™ ser Oln Glu Ar g Ar 9 Leu Leu Lys Lys 
T „r M er ser Gly mr Ph. Leu Leu «, Phe Ser Glu ser Ser Glu Gly 

al y II. Thr cys ser Trp Val Glu His G ln Asp Asp Asp Lys v.1 Leu 

610 bIS 
Iie Tyr Ser Val Gin Pro Tyr Thr Lys Glu Val Leu Gin Ser Leu Pro 

Z Thr Glu lie lie Z ^ Tyr Gin Leu Leu Thr Glu Glu Asn Xle 

645 

Pro Glu A,„ Pro Leu Ar, Pne Leu Tyr Pro Ar 9 II. Pro Arg Asp Glu 

„. P„e B1 y Z Tvr Tyr Gin Glu Lys V,! Asn Leu Gin Clu Ar S Ar g 

67 5 

u= Aro Leu lie Val Val Ser Asn Arg Gin Val Asp 
Lys Tyr Leu Lys His Arg Leu lie * ^ 

690 

, t,,o p™ Glu Pro Glu Leu Glu Ser 
^u Leu Gin Gin Pro Leu .ci ^ ^~ ^ 720 



Gil 

70C 
Leu Glu Leu Glu Leu Gly Leu Val Pro Glu Pro Glu Leu Ser Leu Asp
                725                 730                 735

Leu Glu Pro Leu Leu Lys Ala Gly Leu Asp Leu Gly Pro Glu Leu Glu
            740                 745                 750

Ser Val Leu Glu Ser Thr Leu Glu Pro Val Ile Glu Pro Thr Leu Cys
        755                 760                 765

Met Val Ser Gln Thr Val Pro Glu Pro Asp Gln Gly Pro Val Ser Gln
    770                 775                 780

Pro Val Pro Glu Pro Asp Leu Pro Cys Asp Leu Arg His Leu Asn Thr
785                 790                 795                 800

Glu Pro Met Glu Ile Phe Arg Asn Cys Val Lys Ile Glu Glu Ile Met
                805                 810                 815

Pro Asn Gly Asp Pro Leu Leu Ala Gly Gln Asn Thr Val Asp Glu Val
            820                 825                 830

Tyr Val Ser Arg Pro Ser His Phe Tyr Thr Asp Gly Pro Leu Met Pro
        835                 840                 845

Ser Asp Phe
    850

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3943 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: Human Stat91

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 197..2449
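
A quick arithmetic check relates the CDS coordinates given in the feature tables to the protein lengths stated for the translated sequences. This is a sketch under the assumption that the coordinates are 1-based and inclusive; the function name is hypothetical.

```python
def cds_codons(start, end):
    """Number of whole codons in an inclusive, 1-based CDS range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3

print(cds_codons(25, 2577))   # CDS feature of SEQ ID NO:1
print(cds_codons(197, 2449))  # CDS feature of SEQ ID NO:3
```

Under that assumption the SEQ ID NO:1 range holds 851 codons, matching the 851 amino acids stated for SEQ ID NO:2. The SEQ ID NO:3 range holds 751 codons, one more than the 750 amino acids stated for SEQ ID NO:4, which would be consistent with the range including the stop codon, although the listing does not say so explicitly.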



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATTAAACCTC TCGCCGAGCC CCTCCGCAGA CTCTGCGCCG GAAAGTTTCA TTTGCTGTAT    60

GCCATCCTCG AGAGCTGTCT AGGTTAACGT TCGCACTCTG TGTATATAAC CTCGACAGTC   120

TTGGCACCTA ACGTGCTGTG CGTAGCTGCT CCTTTGGTTG AATCCCCAGG CCCTTGTTGG   180

GGCACAAGGT GGCAGG ATG TCT CAG TGG TAC GAA CTT CAG CAG CTT GAC       229
                  Met Ser Gln Trp Tyr Glu Leu Gln Gln Leu Asp
                  1               5                   10



277 
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s si s s as s s as s ™ si s| si s ss 

SS Sj S £S E S SS S JS S S S S S S5 E 



80 



SS K S ^ Si £ £ s as s s ss ss S £ 

g£5^SiSiSK5SSSSSS|SIS!S 

SS S 5 S S iE S SS S S g S S£ E SS 

13 0 J- j -j 

ES2E!i£E£^E™S-- M i 
SSSSKSi^SsSSSISSSiJSSiiSS 

Si £ if S SS E Si si as ss S 2S 52 IS s £ 
S S? S £ S £ S £ Si S & 1 - iS ?li S5 
IS S E - £ Si S S SI S g 2S Si S 5 Si 

22 0 22 5 

2i ?5 IS S? 32 S £ ss i ^ s; § s 
s s s ss S s SS Sj SS S 1= s s s 

_ «_ rrr rA n pag CTT AAA MG TTG GAG GAA TTG 

ss s ss as si sit s as as e» ^ l. ; oiu «« «. 



27C 



m-rr- n?v7\ at pat ATC ACA AAA AAC AAA CAA 

GAA GAG AAA TAG ACC TAG GAA .AT GA G. A^ ^ 

Glu Gin Lys Tyr Thr Tyi G.u His Asp rro ± 

265 z9U 



[Nucleotide position markers displaced from the sequence rows of SEQ ID NO:3: 325, 373, 421, 469, 517, 565, 613, 661, 709, 757, 805, 853, 901, 949, 997, 1045, 1093]



wo 86 p<™s»««m> 



SI? S SS JS S S S S 2I £ §E SE 2S K SS f ; 

300 305 

Trr TTT GTG GTG GAA AGA CAG CCC TGC ATG CCA ACG CAC CCT CAG AGG 

Ter lie vll S3 Arg Gin Pro Cys Met Pro Thr Hxs Pro Gin Arg 

320 325 ^ 

err PTC GTC TTG AAG ACA GGG GTC CAG TTC ACT GTG AAG TTG AGA CTG 
Pro LeS vll SS Thr Gly Val Gin Phe Thr Val Lys Leu Arg Leu 
340 J * 3 



335 



TGT GCA CGA TGG GCT CAG CTT TCA GAA GTG CTG AGT TGG CAG TTT TCT 
Ss IS Arg irp Ala Gin Leu Ser Glu Val Leu Ser Trp Gin Phe Ser 

x 500 50b 



495 



1141 



lies 



1237 



1285 



1333 



1381 



TTr CTG AAA TTG CAA GAG CTG AAT TAT AAT TTG AAA GTC AAA GTC TTA 
SS val Lyi G^ Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu 

350 355 360 

■TTT GAT AAA GAT GTG AAT GAG AGA AAT ACA GTA AAA GGA TTT AGG AAG 
SZ Asp Asp Sal Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys 
365 370 375 

TTC AAC ATT TTG GGC ACG CAC ACA AAA GTG ATG AAC ATG GAG GAG TCC 
™ US ne £eu Gly Thr His Thr Lys Val Met Asn Met Glu Glu Ser 

380 385 390 

ACC AAT GGC AGT CTG GCG GCT GAA TTT CGG CAC CTG CAA TTG AAA GAA 142 9 

Thr 5£n Gly Ser Leu Ala Ala Glu Phe Arg His Leu Gin Leu Lys Glu 
400 405 

CAG AAA AAT GCT GGC ACC AGA ACG AAT GAG GGT CCT CTC ATC GTT ACT 1477 
5ln Lys Asn Ala Gly Thr Arg Thr Asn Glu Gly Pro Leu lie Val Thr 
415 42 0 

GAA GAG CTT CAC TCC CTT AGT TTT GAA ACC CAA TTG TGC CAG CCT GOT 1525 
gK Glu III His Ser Leu Ser Phe Glu Thr Gin Leu Cys Gin Pro Gly 
430 435 440 

TTG GTA ATT GAC CTC GAG ACG ACC TCT CTG CCC GTT GTG GTG ATC TCC 1573 
Leu Val He Asp Leu Glu Thr Thr Ser Leu Pro Val Val Val lie Ser 
445 450 455 

AAC GTC AGC CAG CTC CCG AGC GGT TGG GCC TCC ATC CTT TGG TAC AAC 1621 
Sn Wl Ser Gin Leu Pro Ser Gly Trp Ala Ser lie Leu Trp Tyr Asn 
460 465 470 475 

ATG CTG GTG GCG GAA CCC AGG AAT CTG TCC TTC TTC CTG ACT CCA CCA 166 S 

SI? Zlu Val Ala Glu Pro Arg Asn Leu Ser Phe Phe Leu Thr Pro Pro 
460 485 * su 



1717 



1765 



1813 



TCT GTC ACC AAA AGA GGT CTC AAT GTG GAC CAG CTG AAC ATG TTG GGA 
Ser Val Thr Lys Arg Gly Leu Asn Val Asp Gin Leu Asn Met Leu Gly 
510 "5 520 

TAC AAG CTT CTT GGT CCT AAC GCC AGC CCC GAT GGT CTC ATT CCG TGG 
Glu £y*s lH Leu Gly Pro Asn Ala Ser Pro Asp Gly Leu lie Pro Trp 
s y 2S 530 535 

ACG AGG TTT TGT AAG GAA AAT ATA AAT GAT AAA AAT TTT CCC TTC TGG - 1861 

?hr Arg Phe Cys Lys Glu Asn lie Asn Asp Lys Asn Phe Pro Phe Trp 

540 545 550 555 

CTT TGG ATT GAA AGC ATC CTA GAA CTC ATT AAA AAA CAC CTG CTC CCT 190 9 

l.eu Trp lie Glu Ser lie Leu Glu Leu lie Lys Lys His Leu Leu Pro 

b6C 565 ->■<■> 



4 <S # 1 
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^ „_ ,~ r rrr rpp ATC AGC AAG GAG CGA GAG 

S S - S S? SS S S? IS ™ x« - ss « g «» 

COT CCC CTG Z JM «C C« « ^ « «C TTC CTG CTG CGG TTC 

Arg Ala Leu Leu Lys Asp Gin Gin fro uxy ^ 
590 byb 

^ ^tvtv rrr rrr ATC ACA TTC ACA TGG GTG GAG CGG 

£ SZ Kt E SS ™ S S & ™ TIP v« «, 

605 610 

S S°. £S & S5 Si S S IS Si S 55 SS £ S | 

* 640 645 

655 660 

?» S i£ S S K S SK S S S5 £ % S S S 

s ^ 5 s s as s s s; s s sj e 

685 690 

™ »,r »rr PAP TTG ATT TCT GTG TCT GAA GTT CAC CCT TCT 

SS £ ne £ S SS ™ S -r «i Gl " ™> His p ™ ??I 

700 705 

„ ^ _ n rTP rrr ATG TCT CCT GAG GAG TTT 

AGA CTT CAG ACC ACA GAC AAC CTG CTC CCC ATb LLi 
Arg Leu Gin Thr Thr Asp Asn Leu Leu Pro Met Ser ^ 
720 /Zb 

« *™ r^rr rrr TrT GTA GAA TTC GAC AGT ATG ATG 

S as - s s ™ ™ S H; ™ ° lu phe Mp ™ Met 

735 74U 
AAC ACA GTA TAGAGCATGA ATTTTTTTCA TCTTCTCTGG CGACAGTTTT 
Asn Thr val 

CCTTCTCATC TGTGATTCCC TCCTGCTACT CTGTTCCTTC ACATCCTGTG TTTCTAGGGA 
^TGAAAGAA AGGCCAGCAA ATTCGCTGCA ACCTGTTGAT AGCAAGTGAA TTTTTCTCTA 
ACTCAGAAAC ATCAGTTACT CTGAAGGGCA TCATGCATCT TACTGAAGGT AAAATTGAAA 
GGCATTCTCT GAAGAGTGGG TTTCACAAGT GAAAAACATC CAGATACACC CAAAGTATCA 
GGACGAGAAT GAGGGTCCTT TGGGAAAGGA GAAGTTAAGC AACATCTAGC AAATGTTATG 
CATAAAGTCA GTGCCCAACT GTTATAGGTT GTTGGATAAA TCAGTGGTTA TTTAGGGAAC 
TGCTTGACGT AGGAACGGTA AATTTCTGTG GGAGAATTCT TACATGTTTT CTTTGCTTTA 
AGTGTAACTG GCAGTTTTCC ATTGGTTTAC CTGTGAAATA GTTCAAAGCC AAGTTTATAT 
ACAATTATAT CAGTCCTCTT TCAAAGGTAG CCATCATGGA TCTGGTAGGG GGAAAATGTG 
-AT~TTATTA CATCTTTCAC ATTGGCTATT TAAAGACAAA GACAAATTCT GTTTCTTGAG 



[Nucleotide position markers displaced from the sequence rows above: 1957, 2005, 2053, 2101, 2149, 2197, 2245, 2293, 2341, 2389, 2437, 2486, 2546, 2606, 2666, 2726, 2786, 2846, 2906, 2966, 3026, 3086]



PCT/US94/10849

WO 95/08629 88 

AAGAGAACAT TTCCAAATTC ACAAGTTGTG TTTGATATCC AAAGCTGAAT ACATTCTGCT 3146
TTCATCTTGG TCACATACAA TTATTTTTAC AGTTCTCCCA AGGGAGTTAG GCTATTCACA 3206
ACCACTCATT CAAAAGTTGA AATTAACCAT AGATGTAGAT AAACTCAGAA ATTTAATTCA 3266
TGTTTCTTAA ATGGGCTACT TTGTCCTTTT TGTTATTAGG GTGGTATTTA GTCTATTAGC 3326
CACAAAATTG GGAAAGGAGT AGAAAAAGCA GTAACTGACA ACTTGAATAA TACACCAGAG 3386
ATAATATGAG AATCAGATCA TTTCAAAACT CATTTCCTAT GTAACTGCAT TGAGAACTGC 3446
ATATGTTTCG CTGATATATG TGTTTTTCAC ATTTGCGAAT GGTTCCATTC TCTCTCCTGT 3506
ACTTTTTCCA GACACTTTTT TGAGTGGATG ATGTTTCGTG AAGTATACTG TATTTTTACC 3566
TTTTTCCTTC CTTATCACTG ACACAAAAAG TAGATTAAGA GATGGGTTTG ACAAGGTTCT 3626
TCCCTTTTAC ATACTGCTGT CTATGTGGCT GTATCTTGTT TTTCC^CTAC TGCTACCACA 3686
ACTATATTAT CATGCAAATG CTGTATTCTT CTTTGGTGGA GATAAAGATT TCTTGAGTTT 3746
TGTTTTAAAA TTAAAGCTAA AGTATCTGTA TTGCATTAAA TATAATATCG ACACAGTGCT 3806
TTCCGTGGCA CTGCATACAA TCTGAGGCCT CCTCTCTCAG TTTTTATATA GATGGCGAGA 3866
ACCTAAGTTT CAGTTGATTT TACAATTGAA ATGACTAAAA AACAAAGAAG ACAACATTAA 3926
AAACAATATT GTTTCTA 3943

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 750 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Ser Gln Trp Tyr Glu Leu Gln Gln Leu Asp Ser Lys Phe Leu Glu
1 5

Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro Met Glu Ile Arg Gln

Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp Glu His Ala Ala Asn

Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp Leu Leu Ser Gln Leu
50 55

Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn Phe Leu Leu Gln

His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln Asp Asn Phe Gln Glu
85 90

Asp Pro Ile Gln Met Ser Met Ile Ile Tyr Ser Cys Leu Lys Glu Glu

Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn Gln Ala Gln Ser Gly
115

Asn Ile Gln Ser Thr Val Met Leu Asp Lys Gln Lys Glu Leu Asp Ser
Lys Val Arg Asn Val Lys Asp Lys Val Met Cys Ile Glu His Glu Ile

Lys Ser Leu Glu Asp Leu Gln Asp Glu Tyr Asp Phe Lys Cys Lys Thr
165 170

Leu Gln Asn Arg Glu His Glu Thr Asn Gly Val Ala Lys Ser Asp Gln
180 185

Lys Gln Glu Gln Leu Leu Leu Lys Lys Met Tyr Leu Met Leu Asp Asn
195 200

Lys Arg Lys Glu Val Val His Lys Ile Ile Glu Leu Leu Asn Val Thr

Glu Leu Thr Gln Asn Ala Leu Ile Asn Asp Glu Leu Val Glu Trp Lys
225 230

Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro Pro Asn Ala Cys Leu
Asp Gln Leu Gln Asn Trp Phe Thr Ile Val Ala Glu Ser Leu Gln Gln
Val Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu Glu Gln Lys Tyr Thr
Tyr Glu His Asp Pro Ile Thr Lys Asn Lys Gln Val Leu Trp Asp Arg
Thr Phe Ser Leu Phe Gln Gln Leu Ile Gln Ser Ser Phe Val Val Glu
Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg Pro Leu Val Leu Lys
Thr Gly Val Gln Phe Thr Val Lys Leu Arg Leu Leu Val Lys Leu Gln
Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu Phe Asp Lys Asp Val
Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys Phe Asn Ile Leu Gly
Thr His Thr Lys Val Met Asn Met Glu Glu Ser Thr Asn Gly Ser Leu

Ala Ala Glu Phe Arg His Leu Gln Leu Lys Glu Gln Lys Asn Ala Gly
405 410



Thr Arg Thr Asn Glu Gly Pro Leu Ile Val Thr Glu Glu Leu His Ser

Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly Leu Val Ile Asp Leu

Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser Asn Val Ser Gln Leu
450 455

Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn Met Leu Val Ala Glu
465 470

Pro Arg Asn Leu Ser Phe Phe Leu Thr Pro Pro Cys Ala Arg Trp Ala
490

Gln LM s „ ou ; al « « « «. - - - - z *■ K9 

Gly „ " -p - - s teu Gly 010 - M Leu Gly 

515

Pro Asn Ala Ser Pro Asp Gly Leu Ile Pro Trp Thr Arg Phe Cys Lys

Glu Asn Ile Asn Asp Lys Asn Phe Pro Phe Trp Leu Trp Ile Glu Ser
545 555

Ile Leu Glu Leu Ile Lys Lys His Leu Leu Pro Leu Trp Asn Asp Gly

Cys Ile Met Gly Phe Ile Ser Lys Glu Arg Glu Arg Ala Leu Leu Lys
Asp Gln Gln Pro Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser Ser Arg

Glu Gly Ala Ile Thr Phe Thr Trp Val Glu Arg Ser Gln Asn Gly Gly

Glu Pro Asp Phe His Ala Val Glu Pro Tyr Thr Lys Lys Glu Leu Ser
625 635

Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys Val Met Ala Ala
655

Glu » ne « Z » - «, Jg ~ - Wr «o « u» »p 

WB „ ». r. *. «v - a; - ». « - - «. ». - 

01u m Z - - - s pro Lys Gly Tht - 11,1 Ile LVS Thr 

Glu Z ne s„ v.> - - - «■ » ?fs - G1 " ™ 

705

Asp Asn Leu Leu Pro Met Ser Pro Glu Glu Phe Asp Glu Val Ser Arg
735

Ile Val Gly Ser Val Glu Phe Asp Ser Met Met Asn Thr Val



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2607 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:






(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 197..2335

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

ATTAAACCTC TCGCCGAGCC CCTCCGCAGA CTCTGCGCCG GAAAGTTTCA TTTGCTGTAT 60

GCCATCCTCG AGAGCTGTCT AGGTTAACGT TCGCACTCTG TGTATATAAC CTCGACAGTC 120

TTGGCACCTA ACGTGCTGTG CGTAGCTGCT CCTTTGGTTG AATCCCCAGG CCCTTGTTGG 180

GGCACAAGGT GGCAGG ATG TCT CAG TGG TAC GAA CTT CAG CAG CTT GAC 229
Met Ser Gln Trp Tyr Glu Leu Gln Gln Leu Asp
1 5 10

TCA AAA TTC CTG GAG CAG GTT CAC CAG CTT TAT GAT GAC AGT TTT CCC 277
Ser Lys Phe Leu Glu Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro
15 20 25

ATG GAA ATC AGA CAG TAC CTG GCA CAG TGG TTA GAA AAG CAA GAC TGG 325
Met Glu Ile Arg Gln Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp
30 35 40

GAG CAC GCT GCC AAT GAT GTT TCA TTT GCC ACC ATC CGT TTT CAT GAC 373
Glu His Ala Ala Asn Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp
45 50 55

CTC CTG TCA CAG CTG GAT GAT CAA TAT AGT CGC TTT TCT TTG GAG AAT 421
Leu Leu Ser Gln Leu Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn
60 65 70 75

AAC TTC TTG CTA CAG CAT AAC ATA AGG AAA AGC AAG CGT AAT CTT CAG 469
Asn Phe Leu Leu Gln His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln
80 85 90

GAT AAT TTT CAG GAA GAC CCA ATC CAG ATG TCT ATG ATC ATT TAC AGC 517
Asp Asn Phe Gln Glu Asp Pro Ile Gln Met Ser Met Ile Ile Tyr Ser
95 100 105

TGT CTG AAG GAA GAA AGG AAA ATT CTG GAA AAC GCC CAG AGA TTT AAT 565
Cys Leu Lys Glu Glu Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn
110 115 120

CAG GCT CAG TCG GGG AAT ATT CAG AGC ACA GTG ATG TTA GAC AAA CAG 613
Gln Ala Gln Ser Gly Asn Ile Gln Ser Thr Val Met Leu Asp Lys Gln
125 130 135

AAA GAG CTT GAC AGT AAA GTC AGA AAT GTG AAG GAC AAG GTT ATG TGT 661
Lys Glu Leu Asp Ser Lys Val Arg Asn Val Lys Asp Lys Val Met Cys
140 145 150 155

ATA GAG CAT GAA ATC AAG AGC CTG GAA GAT TTA CAA GAT GAA TAT GAC 709
Ile Glu His Glu Ile Lys Ser Leu Glu Asp Leu Gln Asp Glu Tyr Asp
160 165 170

TTC AAA TGC AAA ACC TTG CAG AAC AGA GAA CAC GAG ACC AAT GGT GTG 757
Phe Lys Cys Lys Thr Leu Gln Asn Arg Glu His Glu Thr Asn Gly Val
175 180 185

GCA AAG AGT GAT CAG AAA CAA GAA CAG CTG TTA CTC AAG AAG ATG TAT 805
Ala Lys Ser Asp Gln Lys Gln Glu Gln Leu Leu Leu Lys Lys Met Tyr
190 195 200

TTA ATG CTT GAC AAT AAG AGA AAG GAA GTA GTT CAC AAA ATA ATA GAG 853
Leu Met Leu Asp Asn Lys Arg Lys Glu Val Val His Lys Ile Ile Glu
205 210 215

TTP CTG AAT GTC ACT GAA CTT ACC CAG AAT GCC CTG ATT AAT GAT GAA 901
Leu Leu Asn Val Thr Glu Leu Thr Gln Asn Ala Leu Ile Asn Asp Glu
220 225 230 235

CTA GTG GAG TGG AAG CGG AGA CAG CAG AGC GCC TGT ATT GGG GGG CCG 949
Leu Val Glu Trp Lys Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro
240 245

rrr RAT GCT TGC TTG GAT CAG CTG CAG AAC TGG TTC ACT ATA GTT GCG 997
Pro Asn Ala Cys Leu Asp Gln Leu Gln Asn Trp Phe Thr Ile Val Ala
255 260 265

PAP AGT CTG CAG CAA GTT CGG CAG CAG CTT AAA AAG TTG GAG GAA TTG 1045
Glu Ser Leu Gln Gln Val Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu
270 275 280

TAA CAG AAA TAC ACC TAC GAA CAT GAC CCT ATC ACA AAA AAC AAA CAA 1093
Glu Gln Lys Tyr Thr Tyr Glu His Asp Pro Ile Thr Lys Asn Lys Gln
285 290 295

___ Trr rip CGC ACC TTC AGT CTT TTC CAG CAG CTC ATT CAG AGC 1141
Val Leu Trp Asp Arg Thr Phe Ser Leu Phe Gln Gln Leu Ile Gln Ser
300 305 310

_„ _ „ Tr rTG GAA AGA CAG CCC TGC ATG CCA ACG CAC CCT CAG AGG 1189
Ser Phe Val Val Glu Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg
320 325 330

rrr PTC PTP TTG AAG ACA GGG GTC CAG TTC ACT GTG AAG TTG AGA CTG 1237
Pro Leu Val Leu Lys Thr Gly Val Gln Phe Thr Val Lys Leu Arg Leu
335 340

TTG GTG AAA TTG CAA GAG CTG AAT TAT AAT TTG AAA GTC AAA GTC TTA 1285
Leu Val Lys Leu Gln Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu
350 355 360

TTT GAT AAA GAT GTG AAT GAG AGA AAT ACA GTA AAA GGA TTT AGG AAG 1333
Phe Asp Lys Asp Val Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys
365 370 375

TTC AAC ATT TTG GGC ACG CAC ACA AAA GTG ATG AAC ATG GAG GAG TCC 1381
Phe Asn Ile Leu Gly Thr His Thr Lys Val Met Asn Met Glu Glu Ser
380 385 390

ACC AAT GGC AGT CTG GCG GCT GAA TTT CGG CAC CTG CAA TTG AAA GAA 1429
Thr Asn Gly Ser Leu Ala Ala Glu Phe Arg His Leu Gln Leu Lys Glu

CAG AAA AAT GCT GGC ACC AGA ACG AAT GAG GGT CCT CTC ATC GTT ACT 1477
Gln Lys Asn Ala Gly Thr Arg Thr Asn Glu Gly Pro Leu Ile Val Thr
415 420 425

r*A CAC CTT CAC TCC CTT AGT TTT GAA ACC CAA TTG TGC CAG CCT GGT 1525
Glu Glu Leu His Ser Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly
430 435 440

TTG GTA ATT GAC CTC GAG ACG ACC TCT CTG CCC GTT GTG GTG ATC TCC 1573
Leu Val Ile Asp Leu Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser
445 450 455

AAC GTC AGC CAG CTC CCG AGC GGT TGG GCC TCC ATC CTT TGG TAC AAC 1621
Asn Val Ser Gln Leu Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn
460 465 470 475






S SS ?S £ Si SS S iS 2S | SS SS SS S S| S* 



495 



£ ss s E s & ss ss ss ss as s SS s™ ss SJ 



510 515 



SS £| SS SS S3 £ SS S SE SS SS i SS SS SS S 

540 545 550 

S5KffiESSK!2SSSSSS2SE 

s s - i 5 ss ss s sj ss ss s ss as s ss 
s ss ss ss ss as |» ss ss ?ss ss ss ss ss ss 
ss as tss s ss sj ss S ss s ss s s ?s as ss 

610 

ss as ss s ss as s ss ss ss ss si si ss ss f s 

625 630 



ss si si ss SS SI !5 s ss s ss ss s ss SS 



640 



- ?e ss s s as is ss |s as ss ss ss sj ss ss 
ST S s ss - ss ss ss ss s? ss ss ss ss a 

670 b7D 

S 5S ss s £ ss s s si s s i So K S £ 



GGA TAT ATC AAG ACT GAG TTG ATT TCT GTG TCT GAA GTG TAAGTGAACA
Gly Tyr Ile Lys Thr Glu Leu Ile Ser Val Ser Glu Val
700 705 710



CAGAAGAGTG ACATGTTTAC AAACCTCAAG CCAGCCTTGC TCCTGGCTGG GGCCTGTTGA 
AGATGCTTGT ATTTTACTTT TCCATTGTAA TTGCTATCGC CATCACAGCT GAACTTGTTG 
AGATCCCCGT GTTACTGCCT ATCAGCATTT TACTACTTTA AAAAAAAAAA AAAAAGCCAA 
AAACCAAATT TGTATTTAAG GTATATAAAT TTTCCCAAAA CTGATACCCT TTGAAAAAGT 



1669 

1717 

1765 

1813 

1861 
2053

2101 

2149 
2293

2342 

2402 
ATAAATAAAA TGAGCAAAAG TTGAA

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 712 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
Met Ser Gln Trp Tyr Glu Leu Gln Gln Leu Asp Ser Lys Phe Leu Glu
1 5 10

Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro Met Glu Ile Arg Gln
20 25

Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp Glu His Ala Ala Asn
35 40

Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp Leu Leu Ser Gln Leu
50 55

Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn Phe Leu Leu Gln

His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln Asp Asn Phe Gln Glu
85 90

Asp Pro Ile Gln Met Ser Met Ile Ile Tyr Ser Cys Leu Lys Glu Glu
100 105

Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn Gln Ala Gln Ser Gly
115 120

Asn Ile Gln Ser Thr Val Met Leu Asp Lys Gln Lys Glu Leu Asp Ser

Lys Val Arg Asn Val Lys Asp Lys Val Met Cys Ile Glu His Glu Ile
145 150 155

Lys Ser Leu Glu Asp Leu Gln Asp Glu Tyr Asp Phe Lys Cys Lys Thr
165 170

Leu Gln Asn Arg Glu His Glu Thr Asn Gly Val Ala Lys Ser Asp Gln
180 185

Lys Gln Glu Gln Leu Leu Leu Lys Lys Met Tyr Leu Met Leu Asp Asn
205

Lys Arg Lys Glu Val Val His Lys Ile Ile Glu Leu Leu Asn Val Thr

Glu Leu Thr Gln Asn Ala Leu Ile Asn Asp Glu Leu Val Glu Trp Lys
225 230

Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro Pro Asn Ala Cys Leu

Asp Gln Leu Gln Asn Trp Phe Thr Ile Val Ala Glu Ser Leu Gln Gln
260 265

Val Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu Glu Gln Lys Tyr Thr
275 280 285

Tyr Glu His Asp Pro Ile Thr Lys Asn Lys Gln Val Leu Trp Asp Arg
290 295

Thr Phe Ser Leu Phe Gln Gln Leu Ile Gln Ser Ser Phe Val Val Glu

Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg Pro Leu Val Leu Lys

Thr Gly Val Gln Phe Thr Val Lys Leu Arg Leu Leu Val Lys Leu Gln
340 345

Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu Phe Asp Lys Asp Val

Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys Phe Asn Ile Leu Gly

Thr His Thr Lys Val Met Asn Met Glu Glu Ser Thr Asn Gly Ser Leu

Ala Ala Glu Phe Arg His Leu Gln Leu Lys Glu Gln Lys Asn Ala Gly
405 410

Thr Arg Thr Asn Glu Gly Pro Leu Ile Val Thr Glu Glu Leu His Ser

Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly Leu Val Ile Asp Leu

Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser Asn Val Ser Gln Leu

Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn Met Leu Val Ala Glu

Pro Arg Asn Leu Ser Phe Phe Leu Thr Pro Pro Cys Ala Arg Trp Ala

Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser Ser Val Thr Lys Arg

Gly Leu Asn Val Asp Gln Leu Asn Met Leu Gly Glu Lys Leu Leu Gly
515 520

Pro Asn Ala Ser Pro Asp Gly Leu Ile Pro Trp Thr Arg Phe Cys Lys
530 535

Glu Asn Ile Asn Asp Lys Asn Phe Pro Phe Trp Leu Trp Ile Glu Ser
545

Ile Leu Glu Leu Ile Lys Lys His Leu Leu Pro Leu Trp Asn Asp Gly

Cys Ile Met Gly Phe Ile Ser Lys Glu Arg Glu Arg Ala Leu Leu Lys

Asp Gln Gln Pro Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser Ser Arg
595

Glu Gly Ala Ile Thr Phe Thr Trp Val Glu Arg Ser Gln Asn Gly Gly
610 615

Glu Pro Asp Phe His Ala Val Glu Pro Tyr Thr Lys Lys Glu Leu Ser
625

Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys Val Met Ala Ala
645 650

Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro Asn Ile Asp
660 665

Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg Pro Lys Glu Ala Pro
675 680

Glu Pro Met Glu Leu Asp Gly Pro Lys Gly Thr Gly Tyr Ile Lys Thr
690 695

Glu Leu Ile Ser Val Ser Glu Val
705 710



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2277 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(vii) IMMEDIATE SOURCE:

(B) CLONE: Murine Stat91

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 5..2251

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

- S S SS S S5 SS 21 SS SS SS Z S K S5 £] 



1 5 I" 



20 25 



r- B r Tnr CTC GCC CAG TGG CTG GAA AAG CAA GAC TGG GAG CAC GCT GCC
Gln Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp Glu His Ala Ala
35 40

TAT GAT GTC TCG TTT GCG ACC ATC CGC TTC CAT GAC CTC CTC TCA CAG
Tyr Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp Leu Leu Ser Gln
50

^ r*r rzr TAP AGC CGC TTT TCT CTG GAG AAT AAT TTC TTG TTG
Leu Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn Phe Leu Leu
65 70



SS SS i£ £ S £ S S 2S iE 2S SS tl S E §K 



90 

30 ^ 



49 



97 



145 



193 



241 



289 



Si SS SS SI SS S? E SS S SIS £ ^ £ 2S i SS

as s % 1 ss Si SS SS SS s ss s ss sj ss ss 
Si SS s ss ss ss s? ss ss ss ss ss ss ss ss ss 
sti ss S ss ss ss Sj ss - s ss ss s ss ss ss 

SSSSSSSSISSSSSSS^SSSSSSSK 

160 165 "° 

£ S SS SS S| SS SJ SS SS SS % S5 S SS |S ss 
^^SSSiSsSSSSSSSiSSKSSSSSS 

ss ss SS5 5 SS SIS SS as ss s ss ss SS SS SS E 
SIS SS SS SS SS SS Sf ss ss ss ss I SS S5 ss ??s 
SS S SS SS |S SS SS SS S? |J SS SS SS ss ss 

ss ss ss SS SS sis s s sss sj ss ss SS SS SS SS 
ss ss £ ss ss ss ss ss SS SS SS SS SIS SS SS ss 



275 



. rr TaT rAG err GAC CCT ATT ACA AAA AAC AAG CAG GTG TTG TCA GAT
Thr Tyr Glu Pro Asp Pro Ile Thr Lys Asn Lys Gln Val Leu Ser Asp
290

PPA ACC TTC CTC CTC TTC CAG CAG CTC ATT CAG AGC TCC TTC GTG GTA
Arg Thr Phe Leu Leu Phe Gln Gln Leu Ile Gln Ser Ser Phe Val Val

^ P h r rrr Trr ATG CCC ACT CAC CCG CAG AGG CCC CTG GTC TTG
Glu Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg Pro Leu Val Leu
320 325

™r nrT rrr rTA CAG TTC ACT GTC AAG TCG AGA CTG TTG GTG AAA TTG
Lys Thr Gly Val Gln Phe Thr Val Lys Ser Arg Leu Leu Val Lys Leu
340 345

r*6 r B r TCG AA^ CTA TTA ACG AAA GTG AAA TGT CAC TTT GAC AAA GAT
Gln Glu Ser Asn Leu Leu Thr Lys Val Lys Cys His Phe Asp Lys Asp

— lh SI J\>ZJ 



337 



385 



433 



481 



529 



577 



625 



673 



721 



769 



817



865



913 



961 



1009 



1057
370 3

a rrr Arc AAC GGA AGT 

s s ss s E S! s? s S Si Si I » ™ «, s« 

s ss s s ffi S s? - ss § ?s s as ai ss 

SSgSSSSSESSSESSSS 

JE S E 5 S S S S S 5 3? K 5 = - - 

465 470 

S ^ S £ SS SS S S S |] S S S 515 S IS 

500 0UD 
«» CGT CTG « OCA « go £3 KC .TO CTG GG» »S J» CTG £3 

Arq Gly Leu Asn Ala Asp Gin Leu ber 525 
515 ^ 

^ rrT rTT ATT CCA TGG ACA AGG TTT TGT 

S So S S S5 S 2J S S S~ «p »; «. - cv, 

530 5Jb 

iv at ttp TCC TTC TGG CCT TGG ATT GAC 

£ Si JS S JS SS & " S S S. s « «, „ ,s P 
S » 2S S £ 1 5 S S S I S 2 S £ S 

~ SS S S S i 5S S 32 1 S SI S K 2j SS 

rt m „ ntvr rTT AHA TTC AGT GAG AGC TCC 

55 S f S £ S5 ?S S SS SS £5 - - - ~ 

CGG GM «* BC £ TTC £ TGG GTG » CGG - « J* «J 

Arq Glu Gly Ala lie Thr Phe inr up 62Q 
610 bI - J 

„ r rr TA r ACG AAA AAA GAA CTT 

GGT GAA CCT GAC TTC CAT GCC GTG GAG CCC £ ACG ^ 
Gly Glu Pro Asp Phe His Ala Va, GJu 635 
625 630 



1201 



1249 



1297 



1345 



1393 



1441 



1489 



1537 



1585 



1633 



1681 



1729 



1777 



1825



1873 



1921



TCA GCT GTT ACT TTC CCA GAi ATT ATT CGC AAC TAC AAA GTC ATG GCT 1969
Ser Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys Val Met Ala
640 645 650 655

GCC GAG AAC ATA CCA GAG AAT CCC CTG AAG TAT CTG TAC CCC AAT ATT 2017
Ala Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro Asn Ile
660 665 670

GAC AAA GAC CAC GCC TTT GGG AAG TAT TAT TCC AGA CCA AAG GAA GCA 2065
Asp Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg Pro Lys Glu Ala
675 680 685

CCA GAA CCG ATG GAG CTT GAC GAC CCT AAG CGA ACT GGA TAC ATC AAG 2113
Pro Glu Pro Met Glu Leu Asp Asp Pro Lys Arg Thr Gly Tyr Ile Lys
690 695 700

ACT GAG TTG ATT TCT GTG TCT GAA GTC CAC CCT TCT AGA CTT CAG ACC 2161
Thr Glu Leu Ile Ser Val Ser Glu Val His Pro Ser Arg Leu Gln Thr
705 710 715

ACA GAC AAC CTG CTT CCC ATG TCT CCA GAG GAG TTT GAT GAG ATG TCC 2209
Thr Asp Asn Leu Leu Pro Met Ser Pro Glu Glu Phe Asp Glu Met Ser
720 725 730 735

CGG ATA GTG GGC CCC GAA TTT GAC AGT ATG ATG AGC ACA GTA 2251
Arg Ile Val Gly Pro Glu Phe Asp Ser Met Met Ser Thr Val
740 745

TAAACACGAA TTTCTCTCTG GCGACA 2277



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 749 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Met Ser Gln Trp Phe Glu Leu Gln Gln Leu Asp Ser Lys Phe Leu Glu
1 5 10

Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro Met Glu Ile Arg Gln
20 25

Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp Glu His Ala Ala Tyr
35 40 45

Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp Leu Leu Ser Gln Leu
50 55 60

Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn Phe Leu Leu Gln
65 70 80

His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln Asp Asn Phe Gln Glu
85 90

Asp Pro Val Gln Met Ser Met Ile Ile Tyr Asn Cys Leu Lys Glu Glu
100 105 110

Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn Gln Ala Gln Glu Gly
115

Asn Ile Gln Asn Thr Val Met Leu Asp Lys Gln Lys Glu Leu Asp Ser
130 135

Lys Val Arg Asn Val Lys Asp Gln Val Met Cys Ile Glu Gln Glu Ile

Lys Thr Leu Glu Glu Leu Gln Asp Glu Tyr Asp Phe Lys Cys Lys Thr
165 170

Ser Gln Asn Arg Glu Gly Glu Ala Asn Gly Val Ala Lys Ser Asp Gln
180 185

Lys Gln Glu Gln Leu Leu Leu His Lys Met Phe Leu Met Leu Asp Asn
195 200

Lys Arg Lys Glu Ile Ile His Lys Ile Arg Glu Leu Leu Asn Ser Ile

Glu Leu Thr Gln Asn Thr Leu Ile Asn Asp Glu Leu Val Glu Trp Lys
225 230

Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro Pro Asn Ala Cys Leu
245 250

Asp Gln Leu Gln Thr Trp Phe Thr Ile Val Ala Glu Thr Leu Gln Gln
260 265

Ile Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu Glu Gln Lys Phe Thr
275 280

Tyr Glu Pro Asp Pro Ile Thr Lys Asn Lys Gln Val Leu Ser Asp Arg
290 295

Thr Phe Leu Leu Phe Gln Gln Leu Ile Gln Ser Ser Phe Val Val Glu
305 310 315

Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg Pro Leu Val Leu Lys
325 330

Thr Gly Val Gln Phe Thr Val Lys Ser Arg Leu Leu Val Lys Leu Gln
340 345

Glu Ser Asn Leu Leu Thr Lys Val Lys Cys His Phe Asp Lys Asp Val
355 360

Asn Glu Lys Asn Thr Val Lys Gly Phe Arg Lys Phe Asn Ile Leu Gly
370 375

Thr His Thr Lys Val Met Asn Met Glu Glu Ser Thr Asn Gly Ser Leu
385 390

Ala Ala Glu Leu Arg His Leu Gln Leu Lys Glu Gln Lys Asn Ala Gly
405 410

Asn Arg Thr Asn Glu Gly Pro Leu Ile Val Thr Glu Glu Leu His Ser
420 425

Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly Leu Val Ile Asp Leu
435 440

Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser Asn Val Ser Gln Leu
450 455

Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn Met Leu Val Thr Glu
o_ r Ph ^ nv, P j.01 1 a En Pro Pro Cvs Ala Trp Trp Ser

Pro Arg Asn L-eu ^cr tn c Li*e i—- — -* - 495 

485 

Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser Ser Val Thr Lys Arg
500 505

Gly Leu Asn Ala Asp Gln Leu Ser Met Leu Gly Glu Lys Leu Leu Gly
515 520 525

Pro Asn Ala Gly Pro Asp Gly Leu Ile Pro Trp Thr Arg Phe Cys Lys
530

Glu Asn Ile Asn Asp Lys Asn Phe Ser Phe Trp Pro Trp Ile Asp Thr
545 550 555

Ile Leu Glu Leu Ile Lys Asn Asp Leu Leu Cys Leu Trp Asn Asp Gly
565 570

Cys Ile Met Gly Phe Ile Ser Lys Glu Arg Glu Arg Ala Leu Leu Lys
580

Asp Gln Gln Pro Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser Ser Arg
595

Glu Gly Ala Ile Thr Phe Thr Trp Val Glu Arg Ser Gln Asn Gly Gly
610 615

Glu Pro Asp Phe His Ala Val Glu Pro Tyr Thr Lys Lys Glu Leu Ser
625 630 635

Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys Val Met Ala Ala
645 650

Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro Asn Ile Asp
660 665

Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg Pro Lys Glu Ala Pro
675 680

Glu Pro Met Glu Leu Asp Asp Pro Lys Arg Thr Gly Tyr Ile Lys Thr
690 695

Glu Leu Ile Ser Val Ser Glu Val His Pro Ser Arg Leu Gln Thr Thr
705 710 715

Asp Asn Leu Leu Pro Met Ser Pro Glu Glu Phe Asp Glu Met Ser Arg
725

Ile Val Gly Pro Glu Phe Asp Ser Met Met Ser Thr Val
740 745

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2375 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: splenic/thymic
(B) CLONE: Murine 13sf1

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 34..2277

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TGCCACTACC TGGACGGAGA GAGAGAGAGC AGC ATG TCT CAG TGG AAT CAA GTC 54
Met Ser Gln Trp Asn Gln Val
1 5

CAA CAA TTA GAA ATC AAG TTT TTG GAG CAA GTA GAT CAG TTC TAT GAT 102
Gln Gln Leu Glu Ile Lys Phe Leu Glu Gln Val Asp Gln Phe Tyr Asp

GAC AAC TTT CCT ATG GAA ATC CGG CAT CTG CTA GCT CAG TGG ATT GAG 150
Asp Asn Phe Pro Met Glu Ile Arg His Leu Leu Ala Gln Trp Ile Glu
25 30 35

ACT CAA GAC TGG GAA GTA GCT TCT AAC AAT GAA ACT ATG GCA ACA ATT 198
Thr Gln Asp Trp Glu Val Ala Ser Asn Asn Glu Thr Met Ala Thr Ile
40 45

CTG CTT CAA AAC TTA CTA ATA CAA TTG GAT GAA CAG TTG GGG CGG GTT 246
Leu Leu Gln Asn Leu Leu Ile Gln Leu Asp Glu Gln Leu Gly Arg Val
60 65

TCC AAA GAA AAA AAT CTG CTA TTG ATT CAC AAT CTA AAG AGA ATT AGA 294
Ser Lys Glu Lys Asn Leu Leu Leu Ile His Asn Leu Lys Arg Ile Arg
75 80

AAA GTT CTT CAG GGC AAG TTT CAT GGA AAT CCA ATG CAT GTA GCT GTG 342
Lys Val Leu Gln Gly Lys Phe His Gly Asn Pro Met His Val Ala Val
90 95

GTA ATT TCA AAT TGC TTA AGG GAA GAG AGG AGA ATA TTG GCT GCA GCC 390
Val Ile Ser Asn Cys Leu Arg Glu Glu Arg Arg Ile Leu Ala Ala Ala
105 110 115

AAC ATG CCT ATC CAG GGA CCT CTG GAG AAA TCC TTA CAG AGT TCT TCA 438
Asn Met Pro Ile Gln Gly Pro Leu Glu Lys Ser Leu Gln Ser Ser Ser
120 125 130

GTT TCT GAA AGA CAA AGG AAT GTG GAA CAC AAA GTG TCT GCC ATT AAA 486
Val Ser Glu Arg Gln Arg Asn Val Glu His Lys Val Ser Ala Ile Lys
140

AAC AGT GTG CAG ATG ACA GAA CAA GAT ACC AAA TAC TTA GAA GAi
Asn Ser Val Gln Met Thr Glu Gln Asp Thr Lys Tyr Leu Glu Asp Leu
155 160 165

CAA GAT GAG TTT GAC TAC AGG TAT AAA ACA ATT CAG ACA ATG GAT CAG 582
Gln Asp Glu Phe Asp Tyr Arg Tyr Lys Thr Ile Gln Thr Met Asp Gln
170 175 180

PPT CAC AAA AAC AGT ATC CTG GTG AAC CAG GAA GTT TTG ACA CTG CTG 630
S Asp Lys Asn Ser Ile Leu Val Asn Gln Glu Val Leu Thr Leu Leu
185 190 195

rsa raA ATG CTT AAT AGT CTG GAC TTC AAG AGA AAG GAA GCA CTC AGT 678
Gln Glu Met Leu Asn Ser Leu Asp Phe Lys Arg Lys Glu Ala Leu Ser
200 205

AAG ATG ACG CAG ATA GTG AAC _ ^ T ~„ n mr> n*vn ATP, AGC ATG 726
Lys Met Thr Gln Ile Val Asn SS S Asp Leu Leu S« Asn Ser Met
220 225

CTT CTA GAA GAG CTG CAG GAC TGG AAA AAG CGG CAC AGG ATT GCC TGC 774
Leu Leu Glu Glu Leu Gln Asp Trp Lys Lys Arg His Arg Ile Ala Cys
235 240

ATT GGT GGC CCG CTC CAC AAT rrr rVG GAC CAG CTT CAG AAC TGC TTT 822
Ile Gly Gly Pro Leu His Asn Gly Leu Asp Gln Leu Gln Asn Cys Phe
250 255 260

ACC CTA CTG GCA GAG AGT CTT TTC CAA CTC AGA CAG CAA CTG GAG AAA 870
Thr Leu Leu Ala Glu Ser Leu Phe Gln Leu Arg Gln Gln Leu Glu Lys
265 270 275

CTA CAG GAG CAA TCT ACT AAA ATG ACC TAT GAA GGG GAT CCC ATC CCT 918
Leu Gln Glu Gln Ser Thr Lys Met Thr Tyr Glu Gly Asp Pro Ile Pro
280 285 290

GCT CAA AGA GCA CAC CTC CTG GAA AGA GCT ACC TTC CTG ATC TAC AAC 966
Ala Gln Arg Ala His Leu Leu Glu Arg Ala Thr Phe Leu Ile Tyr Asn
300 305 310

CTT TTC AAG AAC TCA TTT GTG GTC GAG CGA CAC GCA TGC ATG CCA ACG 1014
Leu Phe Lys Asn Ser Phe Val Val Glu Arg His Ala Cys Met Pro Thr
315 320 325

CAC CCT CAG AGG CCG ATG GTA TTT AAA ACC CTC ATT CAG TTC ACT GTA 1062
His Pro Gln Arg Pro Met Val Leu Lys Thr Leu Ile Gln Phe Thr Val
330 335 340

AAA CTG AGA TTA CTA ATA AAA TTG CCG GAA CTA AAC TAT CAG GTG AAA 1110
Lys Leu Arg Leu Leu Ile Lys Leu Pro Glu Leu Asn Tyr Gln Val Lys
345 350 355

GTA AAG GCG TCC ATT GAC AAG AAT GTT TCA ACT CTA AGC AAT AGA AGA 1158
Val Lys Ala Ser Ile Asp Lys Asn Val Ser Thr Leu Ser Asn Arg Arg
360 365 370 375

TTT GTG CTT TGT GGA ACT CAC nr AAA GCT ATG TCC AGT GAG GAA TCT 1206
Phe Val Leu Cys Gly Thr His Val Lys Ala Met Ser Ser Glu Glu Ser
380 385

TCC AAT GGG AGC CTC TCA GTG TAG TTA GAC ATT GCA ACC CAA GGA GAT 1254
Ser Asn Gly Ser Leu Ser Val Glu Leu Asp Ile Ala Thr Gln Gly Asp
395 400 405

GAA GTG CAG TAC TGG AGT AAA GGA AAC GAG GGC TGC CAC ATG GTG ACA 1302
Glu Val Gln Tyr Trp Ser Lys Gly Asn Glu Gly Cys His Met Val Thr
410

GAG GAG TTG CAT TCC ATA ACC TTT GAG ACC CAG ATC TGC CTC TAT GGC 1350
Glu Glu Leu His Ser Ile Thr Phe Glu Thr Gln Ile Cys Leu Tyr Gly
425 430 435

CTC ACC ATT AAC CTA GAG ACC AGC TCA TTA CCT GTC GTG ATG ATT TCT 1398
Leu Thr Ile Asn Leu Glu Thr Ser Ser Leu Pro Val Val Met Ile Ser
440 445 450 455

AAT GTC AGC CAA CTA CCT AAT GCA TGG GCA TCC ATC ATT TGG TAC AAT 1446
Asn Val Ser Gln Leu Pro Asn Ala Trp Ala Ser Ile Ile Trp Tyr Asn
460 465 470

GTA TCA ACT AAC GAC TCC CAG AAC TTG GTT TTC TTT AAT AAC CCT CCA 1494
Val Ser Thr Asn Asp Ser Gln Asn Leu Val Phe Phe Asn Asn Pro Pro
475 480 485

IS S5 £ ^ S5 Si 2S S ^ S? £? S f s Si E S

490 495 

IS £ S5 ss s & s s s s° ss i i£ s s s 

505 510 

s 5: s s s s e as s £ I s as s s s 

S3SSSSS5SSIS52SES 

SS S, SI S SS £ SS E SJ £ 5" - i SS 5Z - 

570 575 

m rrT rnn ACA TTT TTG TTA AGA TTC 

S 2 X E JS SS ^ 22 SS SS SSr g. - - « «- 

585 590 
AGT GAG « GAT CTT GGA GGG ATA ACC TIC ACC TGG CTG GAC CAA TCT 

Ser Glu Ser His Leu Gly Gly ne inr m gl5 
600 605 

sr js s? ai |i us s as is i si s s ss Si ss 

- S S SI SS S S. S E S 5S S S 1 £ ?5 

635 64U 

™n .rr rrT GAA AAC CCT CTG AAG TAG CTC TAC CCT 

ffi S SI SI « iS p" Si X. Pro w Tjr L» TVr Pro 

650 655 

~ ^t,* ^r>r- ttt rr,r AAA CAC TAC AGC TCC CAG CCG 
2J S Pro K SS E K ^ S JS Si. Tyr ser Ser G!„ Pro 

665 670 

£ Si SE S S So £ Si - SS 25 5! S5 5? - 1 
E5SSiS5ESS3S55SaS 

700 7Ub 

™ ptt pTr rrr atg tct cca agt gca tat gct gtg 
Si sS So Si S 21 2S Pro SJ ser Pro ser Al a Tyr «. V.I 

715 72U 

™~ r-r arc CCA ACG ACA ATT GAA ACT GCA ATG AAT TCC 
25 S Glu i£ 2S Ser PrS ?S Thr Xle G,u Thr Ala Met Asn Ser 



730 735 

CCA TAT TCT GCT GAA TGACGGTGCA AACGGACACT TTAAAGAAGG AAGCAGATGA 

Pro Tyr Ser Ala Glu 
745 



1542 



1590 



1638 



1686 



1734 



1782 



1830 



1878 



1926 



1974 



2022 



2070 



2118 



2166 



2214 



2262 



2317 
^0 TGTTCTTTAC C^TCAC AATTTATTTC n»« ™*»*

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 748 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Met Ser Gln Trp Asn Gln Val Gln Gln Leu Glu Ile Lys Phe Leu Glu
Gln Val Asp Gln Phe Tyr Asp Asp Asn Phe Pro Met Glu Ile Arg His
Leu Leu Ala Gln Trp Ile Glu Thr Gln Asp Trp Glu Val Ala Ser Asn
Asn Glu Thr Met Ala Thr Ile Leu Leu Gln Asn Leu Leu Ile Gln Leu
Asp Glu Gln Leu Gly Arg Val Ser Lys Glu Lys Asn Leu Leu Leu Ile
His Asn Leu Lys Arg Ile Arg Lys Val Leu Gln Gly Lys Phe His Gly

Asn Pro Met His Val Ala Val Val Ile Ser Asn Cys Leu Arg Glu Glu

100

Kg OT „. «. M . «• « »« « iu si - m tto Leu Glu 

Lys S „ 2 «. »r ser S„ v.. s« 6 1. «. »« *» V.1 

130 iJ 
a . W s v., S« Ma n. ». v.. g. « - - «» "3 

™ w. tv, «» «« »» - «» «■ ?a pte Mp AC9 & LYS 

Thr He 0l n T te « «P «y -S >*■ S " 3°. 

^ m^*. to- aqn Qf>r Leu Asp Pfte 
Gin Glu Val Leu Thr Leu Leu Gin Glu Me, .e- Asr. 

195 ^ U 

W3 «, ,v= »• - «i « TW =1 " 3! val ls ° G1 " 

ssp Hi »o H., -» S« «« «. «» «» - «» « SK 

" « ». «, nj «• »« c.y - - », g, 

isp G1 „ UU Gl„ *=» C y s T»r Lju «. »• «- »< j" "» G1 " 

260 

P i r o pr Tbi- Lys Met Thr 
Leu Arg Gin Gin Leu Glu Lys Leu G!n ..u G.n Se: Th- 



2375 
Tyr Glu Gly Asp Pro Ile Pro Ala Gln Arg Ala His Leu Leu Glu Arg
290 295

Ala Thr Phe Leu Ile Tyr Asn Leu Phe Lys Asn Ser Phe Val Val Glu

His Ala Cys Met Pro Thr His Pro Gln Arg Pro Met Val Leu Lys

Thr Leu Ile Gln Phe Thr Val Lys Leu Arg Leu Leu Ile Lys Leu Pro
340 345

Glu Leu Asn Tyr Gln Val Lys Val Lys Ala Ser Ile Asp Lys Asn Val

Ser Thr Leu Ser Asn Arg Arg Phe Val Leu Cys Gly Thr His Val Lys
370 375

Ala Met Ser Ser Glu Glu Ser Ser Asn Gly Ser Leu Ser Val Glu Leu

Asp Ile Ala Thr Gln Gly Asp Glu Val Gln Tyr Trp Ser Lys Gly Asn
405 410

Glu Gly Cys His Met Val Thr Glu Glu Leu His Ser Ile Thr Phe Glu
420 425

Thr Gln Ile Cys Leu Tyr Gly Leu Thr Ile Asn Leu Glu Thr Ser Ser
435 440

Leu Pro Val Val Met Ile Ser Asn Val Ser Gln Leu Pro Asn Ala Trp
450 455

Ala Ser Ile Ile Trp Tyr Asn Val Ser Thr Asn Asp Ser Gln Asn Leu
465 470 475

Val Phe Phe Asn Asn Pro Pro Ser Val Thr Leu Gly Gln Leu Leu Glu
485 490

Val Met Ser Trp Gln Phe Ser Ser Tyr Val Gly Arg Gly Leu Asn Ser
500 505

Glu Gln Leu Asn Met Leu Ala Glu Lys Leu Thr Val Gln Ser Asn Tyr
515 520

Asn Asp Gly His Leu Thr Trp Ala Lys Phe Cys Lys Glu His Leu Pro
530 535

Gly Lys Thr Phe Thr Phe Trp Thr Trp Leu Glu Ala Ile Leu Asp Leu
545 550 555

Ile Lys Lys His Ile Leu Pro Leu Trp Ile Asp Gly Tyr Ile Met Gly

Phe Val Ser Lys Glu Lys Glu Arg Leu Leu Leu Lys Asp Lys Met Pro
580 585

Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser His Leu Gly Gly Ile Thr
595 600

Phe Thr Trp Val Asp Gln Ser Glu Asn Gly Glu Val Arg Phe His Ser
610 615

Val Glu Pro Tyr Asn Lys Gly Arg Leu Ser Ala Leu Ala Phe Ala Asp
625 630






Ile Leu Arg Asp Tyr Lys Val Ile Met Ala Glu Asn Ile Pro Glu Asn
645 650

Pro Leu Lys Tyr Leu Tyr Pro Asp Ile Pro Lys Asp Lys Ala Phe Gly

Lys His Tyr Ser Ser Gln Pro Cys Glu Val Ser Arg Pro Thr Glu Arg
675 680

Gly Asp Lys Gly Tyr Val Pro Ser Val Phe Ile Pro Ile Ser Thr Ile

Arg Ser Asp Ser Thr Glu Pro Gln Ser Pro Ser Asp Leu Leu Pro Met

Ser Pro Ser Ala Tyr Ala Val Leu Arg Glu Asn Leu Ser Pro Thr Thr
725 730

Ile Glu Thr Ala Met Asn Ser Pro Tyr Ser Ala Glu
740 745

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2869 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mouse

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: splenic/thymic
(B) CLONE: Murine 19sf6

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 69..2378



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
r-r-PGACCA GCCAGGCCGG CCAGTCGGGC TCAGCCCGGA GACAGTCGAG ACCCCTGACT 

™ s s as S ES SS 2S S°» SS S S Si 2S £ 
SS 55 K SS SS 2SS E £ S S E SK S 

~ s s s s as £ - s £ s £ s 

3 5 

«c »c r q. j» cfT jcc » «. - Jjr eg j« crc g - 

Ala Ser Lys Gjlu Ser His Aid jjw gQ 



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G AA ATT GAC <M CAA TAT AGC O0J TTC CTC CAA GAG TJC J» GTC ™ 

Glu He Asp Gin Gin Tyr Ser Arg r ?5 

TAT CAG MC CTT CGA £ «C £ ™ CTC «G J« J» ™ 

Tyr Gin His Asn Leu Arg Arg lie bys * gQ 

CTT GAG AAC CC «G GAA ATT GCC CGG £C GTG GCC « JGC CTG » 
Leu Glu Lys Pro Met Glu He Ma Arg n= ^ 110 

95 1 00 

GAA GAG TCT CGC CTC CTC CAG ACG GCA GCC ACG GCA GCC CM CAA «. 

Glu Glu Ser Arg Leu Leu Gin Tnr Aia * 125 
115 1ZU 

*™ rrr rrr GTA GTG ACA GAG AAG CAG CAG 
GGC CAG GCC AAC CAC CCA ACA GCC GCC GTA GTG ^ ^ ^ ^ 

Gly Gin Ala Asn His Pro Thr Aia Aid a i4Q 

- S & s s ss s as ss s ss as SS S ffi i 

S s as s !5 S s as "s ss as a; s a; as s 

195 ^ UU 
CTC ACA GCC CTG GAC « «S CGG AGA AGC ATT GTG AGT GAG CTG GCG 

Leu Thr Ala Leu Asp Gin Met Arg Arg t»er ^ 

ss s s S s s as ss s; as ss s s s zs as 

SKSES5SSSEKSE35SS585 

- S £S S c-S 22 E S 2S Si iE S SS S S t | 

260 



255 



^ ^r *rr rrr CAA CAA ATT AAG AAA CTG GAG GAG 
GCA GAA TCT CAA CTT CAG ACC CGC CAA CAA ATT ^ ^ ^ 

Ala Glu Ser Gin Leu Gin Thr Arg * ^ 

275 

™^ * Tip rrr r*r CCT ATC GTG CAG CAC CGG 

s as as ^ s s? % e ss £ £ ». v., « ; - 

« ATG CTG lie GAG ACG ATC GTO GAG CTG TTC ACA AAC TTA ATG AAG 

Pro Met Leu Glu G^u Arg lie ^ 315 

AGT GCC Z GTG GTG GAG CGG GAG CCC TGC ATG CCC ATG CAC CCG CAC 

Ser Ala Phe Val Val Giu Arg o,. m 
320 



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Arq Pro Leu va S 345 

~ ptt AAA ATT AAA. GTG 

2S S K | S S S K £ K 2 5S - - ~ 

^ rrr rrT GCC CTC AGA GGG TCT CGG 
TGC ATT GAT AAA GAC TCT GGG GAT GTT GCT GCC ^ ^ ^ Arg 

Cys He Asp Lys Asp Ser Gly Asp ^ 380 

3 „„* rar TTC AAG CAC CTG ACC CTT AGG 

TCT AAC AAC GGC AGC CTG TCT GCA GAG TTC AAC ^ ^ ^ Arg 

Ser Asn Asn Gly Ser Leu ser *ia 4l0 

4 00 

- S - S s; S S S 5 S £ 5 5 S E = 

- s s s S s s s s s a s « s s 

ss £5 ss ss S s as s ss s ss s s ss ss 

5 - S S S S S SS SS 5! S£ £ £ = = « 

s s s s s s s s a ?s s ss s ss s i 

„r pTr Ar.r ATC GAG CAG CTG ACA 

as s s s s s ss s s i - »• - - ss Tte 

fiCG TO GCT « « CXC C» 000 «. GGT GTG JJC « J» « ™T 

ffiffiSSSWIEiSsSSSSgJSS 

~™ rr-r TAC ATC ATG GGT TTC ATC AGC 

- if. 25 ss s s - - ™ n ? - «v - »• Si 

l 3,0 CGG 0* COG GCC £ ™ ~ «» £ £ SS f - 

Lye-: Glu Arg Glu Arg Ala lie L_ :> ^ 60d 



CTA CTG CGC TTC AGC GAG AGC AGC AAA GAA GGA GGG GTC ACT TTC ACT 1934
Leu Leu Arg Phe Ser Glu Ser Ser Lys Glu Gly Gly Val Thr Phe Thr
610 615 620

TGG GTG GAA AAG GAC ATC AGT GGC AAG ACC CAG ATC CAG TCT GTA GAG 1982
Trp Val Glu Lys Asp Ile Ser Gly Lys Thr Gln Ile Gln Ser Val Glu
625 630 635

CCA TAC ACC AAG CAG CAG CTG AAC AAC ATG TCA TTT GCT GAA ATC ATC 2030
Pro Tyr Thr Lys Gln Gln Leu Asn Asn Met Ser Phe Ala Glu Ile Ile
640 645 650

ATG GGC TAT AAG ATC ATG GAT GCG ACC AAC ATC CTG GTG TCT CCA CTT 2078
Met Gly Tyr Lys Ile Met Asp Ala Thr Asn Ile Leu Val Ser Pro Leu
655 660 665 670

GTC TAC CTC TAC CCC GAC ATT CCC AAG GAG GAG GCA TTT GGA AAG TAC 2126
Val Tyr Leu Tyr Pro Asp Ile Pro Lys Glu Glu Ala Phe Gly Lys Tyr
675 680 685

TGT AGG CCC GAG AGC CAG GAG CAC CCC GAA GCC GAC CCA GGT AGT GCT 2174
Cys Arg Pro Glu Ser Gln Glu His Pro Glu Ala Asp Pro Gly Ser Ala
690 695 700

GCC CCG TAC CTG AAG ACC AAG TTC ATC TGT GTG ACA CCA ACG ACC TGC 2222
Ala Pro Tyr Leu Lys Thr Lys Phe Ile Cys Val Thr Pro Thr Thr Cys
705 710 715

AGC AAT ACC ATT GAC CTG CCG ATG TCC CCC CGC ACT TTA GAT TCA TTG 2270
Ser Asn Thr Ile Asp Leu Pro Met Ser Pro Arg Thr Leu Asp Ser Leu
720 725 730

ATG CAG TTT GGA AAT AAC GGT GAA GGT GCT GAG CCC TCA GCA GGA GGG 2318
Met Gln Phe Gly Asn Asn Gly Glu Gly Ala Glu Pro Ser Ala Gly Gly
735 740 745 750

CAG TTT GAG TCG CTC ACG TTT GAC ATG GAT CTG ACC TCG GAG TGT GCT 2366
Gln Phe Glu Ser Leu Thr Phe Asp Met Asp Leu Thr Ser Glu Cys Ala
755 760 765

ACC TCC CCC ATG TGAGGAGCTG AAACCAGAAG CTGCAGAGAC GTGACTTGAG 2418
Thr Ser Pro Met
770



ACACCTGCCC CGTGCTCCAC CCCTAAGCAG CCGAACCCCA TATCGTCTGA AACTCCTAAC 2478
TTTGTGGTTC CAGATTTTTT TTTTTAATTT CCTACTTCTG CTATCTTTGG GCAATCTGGG 2538
CACTTTTTAA AAGAGAGAAA TGAGTGAGTG TGGGTGATAA ACTGTTATGT AAAGAGGAGA 2598
GACCTCTGAG TCTGGGGATG GGGCTGAGAG CAGAAGGGAG GCAAAGGGGA ACACCTCCTG 2658
TCCTGCCCGC CTGCCCTCCT TTTTCAGCAG CTCGGGGGTT GGTTGTTAGA CAAGTGCCTC 2718
CTGGTGCCCA TGGCTACCTG TTGCCCCACT CTGTGAGCTG ATACCCCATT CTGGGAACTC 2778
CTGGCTCTGC ACTTTCAACC TTGCTAATAT CCACATAGAA GCTAGGACTA AGCCCAGGAG 2838
GTTCCTCTTT AAATTAAAAA AAAAAAAAAA A 2869



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 770 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Met Ala Gln Trp Asn Gln Leu Gln Gln Leu Asp Thr Arg Tyr Leu Lys

Gil Leu His Gln Leu Tyr Ser Asp Thr Phe Pro Met Glu Leu Arg Gln

Phe Leu Ala Pro Trp Ile Glu Ser Gln Asp Trp Ala Tyr Ala Ala Ser
35 40

Lys Glu Ser His Ala Thr Leu Val Phe His Asn Leu Leu Gly Glu Ile
50 55

Asp Gln Gln Tyr Ser Arg Phe Leu Gln Glu Ser Asn Val Leu Tyr Gln

His Asn Leu Arg Arg Ile Lys Gln Phe Leu Gln Ser Arg Tyr Leu Glu

Lys Pro Met Glu Ile Ala Arg Ile Val Ala Arg Cys Leu Trp Glu Glu

Ser Arg Leu Leu Gln Thr Ala Ala Thr Ala Ala Gln Gln Gly Gly Gln

Ala Asn His Pro Thr Ala Ala Val Val Thr Glu Lys Gln Gln Met Leu

Glu Tin His Leu Gln Asp Val Arg Lys Arg Val Gln Asp Leu Glu Gln

Z Met Lys Val Val Glu Asn Leu Gln Asp Asp Phe Asp Phe Asn Tyr

Lys Thr Leu Lys Ser Gln Gly Asp Met Gln Asp Leu Asn Gly Asn Asn
180 185

Gln Ser Val Thr Arg Gln Lys Met Gln Gln Leu Glu Gln Met Leu Thr
195

Ala Leu Asp Gln Met Arg Arg Ser Ile Val Ser Glu Leu Ala Gly Leu

Leu ll Ala Met Glu Tyr Val Gln Lys Thr Leu Thr Asp Glu Glu Leu
225 230

D ,. -rip vi = ry= He Glv Gly Pro Pro 

Ala Asp Trp Lys Arg Arg Pro x.e A.- 2 __ 
245 250

Asn Ile Cys Leu Asp Arg Leu Glu Asn Trp Ile Thr Ser Leu Ala Glu
260 265

Ser Gln Leu Gln Thr Arg Gln Gln Ile Lys Lys Leu Glu Glu Leu Gln

Gln Lys Z Se, Tyr Lys Gly »P Pro »• » «; «° 



290 295

Leu Glu Glu Arg Ile Val Glu Leu Phe Arg Asn Leu Met Lys Ser Ala
320

Phe Val Val Glu Arg Gln Pro Cys Met Pro Met His Pro Asp Arg Pro
325 335
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« V.1 Ue w. ™r Gly V.1 Oin Pb. 1* T h r ,ys v.1 *J Leu «. 

val Lys Pfce Z oi, « S «" «» - SSI WS 
SSP g. Z «y "p »• »■ »■ "» M9 SS S " *™ LVS 

„ Z «« «v « * - vai « ns Glu Glu s " « ; 

Z «r »u ser «. B. *• w. His -» ™ «» *. JJ" «» 
arg Cys Gly As» Z «Y « »• -J ^ MP »" SS ^ 

420

Thr Glu Glu Leu His Leu Ile Thr Phe Glu Thr Glu Val Tyr His Gln
435 440

Gly Leu Lys Ile Asp Leu Glu Thr His Ser Leu Pro Val Val Val Ile

Ser ll Ile Cys Gln Met Pro Asn Ala Trp Ala Ser Ile Leu Trp Tyr

Z Met -u Thr Asn Asn Pro Lys Asn Val Asn Phe Phe Thr Lys Pro

Pro Ile Gly Thr Trp Asp Gln Val Ala Glu Val Leu Ser Trp Gln Phe
500 505

Ser Ser Thr Thr Lys Arg Gly Leu Ser Ile Glu Gln Leu Thr Thr Leu
515 520

Ala Glu Lys Leu Leu Gly Pro Gly Val Asn Tyr Ser Gly Cys Gln Ile

Thr ll Ala Lys Phe Cys Lys Glu Asn Met Ala Gly Lys Gly Phe Ser
545 550

Phe Trp Val Trp Leu Asp Asn Ile Ile Asp Leu Val Lys Lys Tyr Ile
565 575

Leu Ala Leu Trp Asn Glu Gly Tyr Ile Met Gly Phe Ile Ser Lys Glu
580 585

Arg Glu Arg Ala Ile Leu Ser Thr Lys Pro Pro Gly Thr Phe Leu Leu

Arg Phe Ser Glu Ser Ser Lys Glu Gly Gly Val Thr Phe Thr Trp Val

Glu Lys Asp Ile Ser Gly Lys Thr Gln Ile Gln Ser Val Glu Pro Tyr

Thr Lys Gln Gln Leu Asn Asn Met Ser Phe Ala Glu Ile Ile Met Gly

Tyr Lys Ile Met Asp Ala Thr Asn Ile Leu Val Ser Pro Leu Val Tyr
660 665

Leu Tyr Pro Asp Ile Pro Lys Glu Glu Ala Phe Gly Lys Tyr Cys Arg
675 680

Pro Glu Ser Gln Glu His Pro Glu Ala Asp Pro Gly Ser Ala Ala Pro

Tyr Leu Lys Thr Lys Phe Ile Cys Val Thr Pro Thr Thr Cys Ser Asn

Thr Ile Asp Leu Pro Met Ser Pro Arg Thr Leu Asp Ser Leu Met Gln

Phe Gly Asn Asn Gly Glu Gly Ala Glu Pro Ser Ala Gly Gly Gln Phe

Glu Ser Leu Thr Phe Asp Met Asp Leu Thr Ser Glu Cys Ala Thr Ser
755

Pro Met
770

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: human Stat91

(x) PUBLICATION INFORMATION:
(H) DOCUMENT NUMBER: PCT/US93/02569
(I) FILING DATE: 19-MAR-1993

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Leu Asp Gly Pro Lys Gly Thr Gly ^ «• ^ ™ ^ S«

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GATCGAGATG TATTTCCCAG AAAAG

(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

1 5 

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

Gly Tyr Ile Lys Thr Glu
1 5
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
Lys Val Asn Leu Gln Glu Arg Arg Lys Tyr Leu Lys His
1 5
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Glu Pro Gln Tyr Glu Glu Ile Pro Ile Tyr Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 105 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vii) IMMEDIATE SOURCE:
(B) CLONE: Src

(x) PUBLICATION INFORMATION:
(A) AUTHORS: Waksman, et al.
(C) JOURNAL: Nature
(D) VOLUME: 358
(F) PAGES: 646-653
(G) DATE: 1992

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Ala Glu Glu Trp Tyr Phe Gly Lys Ile Thr Arg Arg Glu Ser Glu Arg
1 5 10

Leu Leu Leu Asn Pro Glu Asn Pro Arg Gly Thr Phe Leu Val Arg Glu
20 25

Ser Glu Thr Thr Lys Gly Ala Tyr Cys Leu Ser Val Ser Asp Phe Phe
35 45

^ *,„ Me C ly »» *■ *> W » HI- ^ ». *S « 

ftsp 1 oiy =>v - - « s « "» ?T G1 " S " » 

l ... »! «. ^ Tyr Ser LV. HJ. «» « ^ «» g- 

85

Arg Leu Thr Asn Val Cys Pro Thr Ser



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 99 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal

(vii) IMMEDIATE SOURCE:
(B) CLONE: Abl

(x) PUBLICATION INFORMATION:
(A) AUTHORS: Overduin et al.
(C) JOURNAL: Proc. Natl. Acad. Sci.
(D) VOLUME: 89
(F) PAGES: 11673-11677
(G) DATE: 1992

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Gl » Lys His Ser Trp Tyr His Gly Pro Val Ser Arg Asn Ala Ala Glu
L Leu Leu Ser Ser Gly Ile Asn Gly Ser Phe Leu Val Arg Glu Ser

„ ^ « - «y «» «• IT ne s " MS « G1 " " y 

Mg V„l w Hi. Ty. W ye „n m ». »r -p «V - TV, 

= rin ser Ars Phe » Thr k. M« Glu Leu val His His 

Val Ser Ser Glu ber ary mc 75 yu 

65 ^ ^ 

His Ser Thr Val Ala Asp Gly Leu Ile Thr Thr Leu His Tyr Pro Ala
85 95

Pro Lys Arg
(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 102 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal

(vii) IMMEDIATE SOURCE:
(B) CLONE: Lck

(x) PUBLICATION INFORMATION:
(A) AUTHORS: Eck, et al.
(C) JOURNAL: Nature
(D) VOLUME: 362
(F) PAGES: 87-91
(G) DATE: 1993

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Ttp Phe P H« WS - - - WS ? " a MU " 

1 ti« nrn Glu Ser Glu Ser 

A1 , TO «, » ~ «• «» s - £ e Leu 11 9 » 

a „ y 1 Pne - «. - v.. « « -P ^ MP " 

Thr Ala Gly ^ er riiC 40 

65 „ r1v Leu Cys Thr Arg Leu Ser 

«i « 3 «i. * » 9 " * la ser MP » y 

Kg Pro cy S Gin Thr can 

(2) INFORMATION FOR SEQ ID NO:23:

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal

(vii) IMMEDIATE SOURCE:
(B) CLONE: p85 [alpha] N
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

Gln Asp Ala Glu Trp Tyr Trp Gly Asp Ile Ser Arg Glu Glu Val Asn
1 5

Glu Lys Leu Arg Asp Thr Ala Asp Gly Thr Phe Leu Val Arg Asp Ala
20 25

Ser Thr Lys Met His Gly Asp Tyr Thr Leu Thr Leu Arg Lys Gly Gly
35 40

Asn Asn Lys Leu Ile Lys Ile Phe His Arg Asp Gly Lys Tyr Gly Phe

Ser Asp Pro Leu Thr Phe Asn Ser Val Val Glu Leu Ile Asn His Tyr

Arg His Glu Ser Leu Ala Gln Tyr Asn Pro Lys Leu Asp Val Lys Leu
85 90

Leu Tyr Pro



WHAT IS CLAIMED IS:

1. A receptor recognition factor implicated in the transcriptional stimulation of genes in target cells in response to the binding of a specific polypeptide ligand to its cellular receptor on said target cell, said receptor recognition factor having the following characteristics:

a) apparent direct interaction with the ligand-bound receptor and activation of one or more transcription factors capable of binding with a specific gene;

b) an activity demonstrably unaffected by the presence or concentration of second messengers;

c) direct interaction with tyrosine kinase domains;

d) a perceived absence of interaction with G-proteins; and

e) an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12.



2. The receptor recognition factor of Claim 1 labeled with a detectable label.

3. The receptor recognition factor of Claim 2 wherein the label is selected from enzymes, chemicals which fluoresce and radioactive elements.

4. An antibody to a receptor recognition factor, the factor to which said antibody is raised having the following characteristics:

a) apparent direct interaction with the ligand-bound receptor and activation of one or more transcription factors capable of binding with a specific gene;

b) an activity demonstrably unaffected by the presence or concentration of second messengers;

c) direct interaction with tyrosine kinase domains;

d) a perceived absence of interaction with G-proteins; and

e) an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12.



5. The antibody of Claim 4 which is a polyclonal antibody.

6. The antibody of Claim 4 which is a monoclonal antibody.

7. An immortal cell line that produces a monoclonal antibody according to Claim 6.

8. The antibody of Claim 4 labeled with a detectable label.

9. The antibody of Claim 8 wherein the label is selected from enzymes, chemicals which fluoresce and radioactive elements.



10. An isolated DNA sequence or degenerate variant thereof, which encodes a receptor recognition factor, or a fragment thereof, selected from the group consisting of:

(A) the DNA sequence of SEQ ID NO:7 (FIGURE 1);

(B) the DNA sequence of SEQ ID NO:9 (FIGURE 2);

(C) the DNA sequence of SEQ ID NO:11 (FIGURE 3);

(D) DNA sequences that hybridize to any of the foregoing DNA sequences under standard hybridization conditions; and

(E) DNA sequences that code on expression for an amino acid sequence encoded by any of the foregoing DNA sequences.



11. A recombinant DNA molecule comprising a DNA sequence or degenerate variant thereof, which encodes a receptor recognition factor, or a fragment thereof, selected from the group consisting of:

(A) the DNA sequence of SEQ ID NO:7 (FIGURE 1);

(B) the DNA sequence of SEQ ID NO:9 (FIGURE 2);

(C) the DNA sequence of SEQ ID NO:11 (FIGURE 3);

(D) DNA sequences that hybridize to any of the foregoing DNA sequences under standard hybridization conditions; and

(E) DNA sequences that code on expression for an amino acid sequence encoded by any of the foregoing DNA sequences.

12. The recombinant DNA molecule of either of Claims 10 or 11, wherein said DNA sequence is operatively linked to an expression control sequence.

13. A probe capable of screening for the receptor recognition factor in alternate species prepared from the DNA sequence of Claim 10.

14. A unicellular host transformed with a recombinant DNA molecule comprising a DNA sequence or degenerate variant thereof, which encodes a receptor recognition factor, or a fragment thereof, selected from the group consisting of:

(A) the DNA sequence of SEQ ID NO:7 (FIGURE 1);

(B) the DNA sequence of SEQ ID NO:9 (FIGURE 2);

(C) the DNA sequence of SEQ ID NO:11 (FIGURE 3);

(D) DNA sequences that hybridize to any of the foregoing DNA sequences under standard hybridization conditions; and

(E) DNA sequences that code on expression for an amino acid sequence encoded by any of the foregoing DNA sequences;

wherein said DNA sequence is operatively linked to an expression control sequence.



15. A method for detecting the presence or activity of a receptor recognition factor, said receptor recognition factor having an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12, wherein said receptor recognition factor is measured by:

A. contacting a biological sample from a mammal in which the presence or activity of said receptor recognition factor is suspected with a binding partner of said receptor recognition factor under conditions that allow binding of said receptor recognition factor to said binding partner to occur; and

B. detecting whether binding has occurred between said receptor recognition factor from said sample and the binding partner;

[...] receptor recognition factor in said sample.

16. A method for detecting the presence and activity of a polypeptide ligand associated with a given invasive stimulus in mammals comprising detecting the presence or activity of a receptor recognition factor according to the method of Claim 15 wherein detection of the presence or activity of the receptor recognition factor indicates the presence and activity of a polypeptide ligand associated with a given invasive stimulus in mammals.

17. The method of Claim 16 wherein said invasive stimulus is selected from the group consisting of viral infection, protozoan infection, [...] mammalian cells, and toxins.

18. A method for detecting the binding sites for a receptor recognition factor, said receptor recognition factor having an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12, wherein the binding sites for said receptor recognition factor are measured by:

A. placing a labeled receptor recognition factor sample in [...] receptor recognition factor are suspected;

B. examining said biological sample in binding studies for the presence of said labeled receptor recognition factor;

wherein the presence of said labeled recognition factor indicates a binding site for a receptor recognition factor.
19. A method of testing the ability of a drug or other entity to modulate the activity of a receptor recognition factor which comprises

A. culturing a colony of test cells which has a receptor for the receptor recognition factor in a growth medium containing the receptor recognition factor;

B. adding the drug under test; and

C. measuring the reactivity of said receptor recognition factor with the receptor on said colony of test cells,

wherein said receptor recognition factor has an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12.

20. An assay system for screening drugs and other agents for ability to modulate the production of a receptor recognition factor, comprising:

A. culturing an observable cellular test colony inoculated with a drug or agent;

B. harvesting a supernatant from said cellular test colony; and

C. examining said supernatant for the presence of said receptor recognition factor wherein an increase or a decrease in a level of said receptor recognition factor indicates the ability of a drug to modulate the activity of said receptor recognition factor, said receptor recognition factor having an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12.



21. A test kit for the demonstration of a receptor recognition factor in a eukaryotic cellular sample, comprising:

A. a predetermined amount of a detectably labelled specific binding partner of a receptor recognition factor, said receptor recognition factor having an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12;

B. other reagents; and

C. directions for use of said kit.
22. A test kit for demonstrating the presence of a receptor recognition factor in a eukaryotic cellular sample, comprising:

A. a predetermined amount of a receptor recognition factor, said receptor recognition factor having an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12;

B. a predetermined amount of a specific binding partner of said receptor recognition factor;

C. other reagents; and

D. directions for use of said kit;

wherein either said receptor recognition factor or said specific binding partner is detectably labelled.



23. The test kit of Claim 21 or 22 wherein said labeled immunochemically reactive component is selected from the group consisting of polyclonal antibodies to the receptor recognition factor, monoclonal antibodies to the receptor recognition factor, fragments thereof, and mixtures thereof.

24. Use of a material selected from the group consisting of a receptor recognition factor, an agent capable of promoting the production and/or activity of said receptor recognition factor, an agent capable of mimicking the activity of said receptor recognition factor, an agent capable of inhibiting the production of said receptor recognition factor, and mixtures thereof, or a specific binding partner thereto, said receptor recognition factor having an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12 in the manufacture of a medicament for preventing and/or treating cellular debilitations, derangements and/or dysfunctions and/or other disease states in mammals.

25. The use according to Claim 24 wherein said disease states include chronic viral hepatitis, hairy cell leukemia, and tumorous conditions.
26. A pharmaceutical composition for the treatment of cellular debilitation, derangement and/or dysfunction in mammals, comprising:

A. a therapeutically effective amount of a material selected from the group consisting of a receptor recognition factor, an agent capable of promoting the production and/or activity of said receptor recognition factor, an agent capable of mimicking the activity of said receptor recognition factor, an agent capable of inhibiting the production of said receptor recognition factor, and mixtures thereof, or a specific binding partner thereto, said receptor recognition factor having an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12; and

B. a pharmaceutically acceptable carrier.

27. A method of determining the interferon-related pharmacological activity of a compound comprising:

administering the compound to a non-human mammal;

determining the level of phosphorylated ISGF3 proteins present, wherein said phosphorylated ISGF3 proteins are proteins having an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12; and

comparing the level of ISGF3 protein-phosphate to a standard.

28. An antisense nucleic acid against a receptor recognition factor mRNA comprising a nucleic acid sequence hybridizing to said mRNA, wherein said receptor recognition factor has an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12.
29. The antisense nucleic acid of Claim 28 which is RNA or DNA.



30. A recombinant DNA molecule having a DNA sequence which, on transcription, produces an antisense ribonucleic acid against a receptor recognition factor mRNA, said antisense ribonucleic acid comprising a nucleic acid sequence capable of hybridizing to said mRNA, wherein said receptor recognition factor has an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12.



31. A receptor recognition factor-producing cell line transfected with the recombinant DNA molecule of Claim 30.



32. A method for creating a cell line which exhibits reduced expression of a receptor recognition factor, comprising transfecting a recognition factor-producing cell line with a recombinant DNA molecule of claim 30.

33. A ribozyme that cleaves receptor recognition factor mRNA, wherein said receptor recognition factor has an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, and SEQ ID NO:12.

34. A recombinant DNA molecule having a DNA sequence which, upon transcription, produces the ribozyme of claim 33.

35. A receptor recognition factor-producing cell line transfected with the recombinant DNA molecule of claim 34.



36. A method for creating a cell line which exhibits reduced expression of a receptor recognition factor, comprising transfecting a recognition factor-producing cell line with the recombinant DNA molecule of claim 33.
FIG. 1A

1 MSQWFELQQL DSKFLEQVHQ LYDDSFPMEI RQYLAQWLEK QDWEHAAYDV
51 SFATIRFHDL LSQLDDQYSR FSLENNFLLQ HNIRKSKRNL QDNFQEDPVQ
101 MSMIIYNCLK EERKILENAQ RFNQAQEGNI QNTVMLDKQK ELDSKVRNVK
151 DQVMCIEQEI KTLEELQDEY DFKCKTSQNR EGEANGVAKS DQKQEQLLLH
201 KMFLMLDNKR KEIIHKIREL LNSIELTQNT LINDELVEWK RRQQSACIGG
251 PPNACLDQLQ TWFTIVAETL QQIRQQLKKL EELEQKFTYE PDPITKNKQV
301 LSDRTFLLFQ QLIQSSFVVE RQPCMPTHPQ RPLVLKTGVQ FTVKSRLLVK
351 LQESNLLTKV KCHFDKDVNE KNTVKGFRKF NILGTHTKVM NMEESTNGSL
401 AAELRHLQLK EQKNAGNRTN EGPLIVTEEL HSLSFETQLC QPGLVIDLET
451 TSLPVVVISN VSQLPSGWAS ILWYNMLVTE PRNLSFFLNP PCAWWSQLSE
501 VLSWQFSSVT KRGLNADQLS MLGEKLLGPN AGPDGLIPWT RFCKENINDK
551 KFSFWPWIDT ILELIKNDLL CLWNDGCIMG FISKERERAL LKDQQPGTFL 
601 LRFSESSREG AITFTWVERS QNGGEPDFHA VEPYTKKELS AVTFPDIIRN 
651 YKVMAAENIP ENPLKYLYPN IDKDHAFGKY YSRPKEAPEP MELDDPKRTG 
701 YIKTELISVS EVHPSRLQTT DNLLPMSPEE FDEMSRIVGP EFDSMMSTV 
FIG. 1B

1 caggatgtca cagtggttcg agcttcagca gctggactcc aagttcctgg 
51 agcaggtcca ccagctgtac gatgacagtt tccccatgga aatcagacag 
101 tacctggccc agtggctgga aaagcaagac tgggagcacg ctgcctatga 
151 tgtctcgttt gcgaccatcc gcttccatga cctcctctca cagctggacg 
201 accagtacag ccgcttttct ctggagaata atttcttgtt gcagcacaac 
251 atacggaaaa gcaagcgtaa tctccaggat aacttccaag aagatcccgt 
301 acagatgtcc atgatcatct acaactgtct gaaggaagaa aggaagattt 
351 tggaaaatgc ccaaagattt aatcaggccc aggagggaaa tattcagaac 
401 actgtgatgt tagataaaca gaaggagctg gacagtaaag tcagaaatgt 
451 gaaggatcaa gtcatgtgca tagagcagga aatcaagacc ctagaagaat 
501 tacaagatga atatgacttt aaatgcaaaa cctctcagaa cagagaaggt 
551 gaagccaatg gtgtggcgaa gagcgaccaa aaacaggaac agctgctgct 
601 ccacaagatg tttttaatgc ttgacaataa gagaaaggag ataattcaca 
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FIG. IB I 

651 aaatcagaga gttgctgaat tccatcgagc tcactcagaa cactctgatt 

701 aatgacgagc tcgtggagtg gaagcgaagg cagcagagcg cctgcatcgg 

751 gggaccgccc aacgcctgcc tggatcagct gcaaacgtgg ttcaccattg 

801 ttgcagagac cctgcagcag atccgtcagc agcttaaaaa gctggaggag 

851 ttggaacaga aattcaccta tgagcccgac cctattacaa aaaacaagca

901 ggtgttgtca gatcgaacct tcctcctctt ccagcagctc attcagagct 

951 ccttcgtggt agaacgacag ccgtgcatgc ccactcaccc gcagaggccc 

1001 ctggtcttga agactggggt acagttcact gtcaagtcga gactgttggt 

1051 gaaattgcaa gagtcgaatc tattaacgaa agtgaaatgt cactttgaca 

1101 aagatgtgaa cgagaaaaac acagttaaag gatttcggaa gttcaacatc 

1151 ttgggtacgc acacaaaagt gatgaacatg gaagaatcca ccaacggaag 

1201 tctggcagct gagctccgac acctgcaact gaaggaacag aaaaacgctg 

1251 ggaacagaac taatgagggg cctctcattg tcaccgaaga acttcactct 

1301 cttagctttg aaacccagtt gtgccagcca ggcttggtga ttgacctgga 

1351 gaccacctct cttcctgtcg tggtgatctc caacgtcagc cagctcccca 
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FIG. 1C

1401 gtggctgggc gtctatcctg tggtacaaca tgctggtgac agagcccagg 
1451 aatctctcct tcttcctgaa ccccccgtgc gcgtggtggt cccagctctc 
1501 agaggtgttg agttggcagt tttcatcagt caccaagaga ggtctgaacg 
1551 cagaccagct gagcatgctg ggagagaagc tgctgggccc taatgctggc 
1601 cctgatggtc ttattccatg gacaaggttt tgtaaggaaa atattaatga 
1651 taaaaatttc tccttctggc cttggattga caccatccta gagctcatta 
1701 agaacgacct gctgtgcctc tggaatgatg ggtgcattat gggcttcatc
1751 agcaaggagc gagaacgcgc tctgctcaag gaccagcagc cagggacgtt 
1801 cctgcttaga ttcagtgaga gctcccggga aggggccatc acattcacat 
1851 gggtggaacg gtcccagaac ggaggtgaac ctgacttcca tgccgtggag 
1901 ccctacacga aaaaagaact ttcagctgtt actttcccag atattattcg 
1951 caactacaaa gtcatggctg ccgagaacat accagagaat cccctgaagt 
2001 atctgtaccc caatattgac aaagaccacg cctttgggaa gtattattcc 
2051 agaccaaagg aagcaccaga accgatggag cttgacgacc ctaagcgaac 
2101 tggatacatc aagactgagt tgatttctgt gtctgaagtc cacccttcta 
2151 gacttcagac cacagacaac ctgcttccca tgtctccaga ggagtttgat 
2201 gagatgtccc ggatagtggg ccccgaattt gacagtatga tgagcacagt 
2251 ataaacacga atttctctct ggcgaca 
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FIG. 2A 

1 MSQWNQVQQL EIKFLEQVDQ FYDDNFPMEI RHLLAQWIET QDWEVASNNE 
51 TMATILLQNL LIQLDEQLGR VSKEKNLLLI HNLKRIRKVL QGKFHGNPMH
101 VAVVISNCLR EERRILAAAN MPIQGPLEKS LQSSSVSERQ RNVEHKVSAI
151 KNSVQMTEQD TKYLEDLQDE FDYRYKTIQT MDQGDKNSIL VNQEVLTLLQ
201 EMLNSLDFKR KEALSKMTQI VNETDLLMNS MLLEELQDWK KRHRIACIGG
251 PLHNGLDQLQ NCFTLLAESL FQLRQQLEKL QEQSTKMTYE GDPIPAQRAH
301 LLERATFLIY NLFKNSFVVE RHACMPTHPQ RPMVLKTLIQ FTVKLRLLIK
351 LPELNYQVKV KASIDKNVST LSNRRFVLCG THVKAMSSEE SSNGSLSVEL
401 DIATQGDEVQ YWSKGNEGCH MVTEELHSIT FETQICLYGL TINLETSSLP
451 VVMISNVSQL PNAWASIIWY NVSTNDSQNL VFFNNPPSVT LGQLLEVMSW
501 QFSSYVGRGL NSEQLNMLAE KLTVQSNYND GHLTWAKFCK EHLPGKTFTF
551 WTWLEAILDL IKKHILPLWI DGYIMGFVSK EKERLLLKDK MPGTFLLRFS
601 ESHLGGITFT WVDQSENGEV RFHSVEPYNK GRLSALAFAD ILRDYKVIMA
651 ENIPENPLKY LYPDIPKDKA FGKHYSSQPC EVSRPTERGD KGYVPSVFIP
701 ISTIRSDSTE PQSPSDLLPM SPSAYAVLRE NLSPTTIETA MNSPYSAE 
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FIG. 2B 

1 tgccactacc tggacggaga gagagagagc agcatgtctc agtggaatca 
51 agtccaacaa ttagaaatca agtttttgga gcaagtagat cagttctatg 
101 atgacaactt tcctatggaa atccggcatc tgctagctca gtggattgag 
151 actcaagact gggaagtagc ttctaacaat gaaactatgg caacaattct 
201 gcttcaaaac ttactaatac aattggatga acagttgggg cgggtttcca 
251 aagaaaaaaa tctgctattg attcacaatc taaagagaat tagaaaagtt 
301 cttcagggca agtttcatgg aaatccaatg catgtagctg tggtaatttc 
351 aaattgctta agggaagaga ggagaatatt ggctgcagcc aacatgccta 
401 tccagggacc tctggagaaa tccttacaga gttcttcagt ttctgaaaga 
451 caaaggaatg tggaacacaa agtgtctgcc attaaaaaca gtgtgcagat 
501 gacagaacaa gataccaaat acttagaaga cctgcaagat gagtttgact 
551 acaggtataa aacaattcag acaatggatc agggtgacaa aaacagtatc 
601 ctggtgaacc aggaagtttt gacactgctg caagaaatgc ttaatagtct 
651 ggacttcaag agaaaggaag cactcagtaa gatgacgcag atagtgaacg 
701 agacagacct gctcatgaac agcatgcttc tagaagagct gcaggactgg 
751 aaaaagcggc acaggattgc ctgcattggt ggcccgctcc acaatgggct 
801 ggaccagctt cagaactgct ttaccctact ggcagagagt cttttccaac
851 tcagacagca actggagaaa ctacaggagc aatctactaa aatgacctat
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FIG. 2C 

901 gaaggggatc ccatccctgc tcaaagagca cacctcctgg aaagagctac 

951 cttcctgatc tacaaccttt tcaagaactc atttgtggtc gagcgacacg 

1001 catgcatgcc aacgcaccct cagaggccga tggtacttaa aaccctcatt 

1051 cagttcactg taaaactgag attactaata aaattgccgg aactaaacta 

1101 tcaggtgaaa gtaaaggcgt ccattgacaa gaatgtttca actctaagca 

1151 atagaagatt tgtgctttgt ggaactcacg tcaaagctat gtccagtgag 

1201 gaatcttcca atgggagcct ctcagtggag ttagacattg caacccaagg 

1251 agatgaagtg cagtactgga gtaaaggaaa cgagggctgc cacatggtga 

1301 cagaggagtt gcattccata acctttgaga cccagatctg cctctatggc 

1351 ctcaccatta acctagagac cagctcatta cctgtcgtga tgatttctaa 

1401 tgtcagccaa ctacctaatg catgggcatc catcatttgg tacaatgtat 

1451 caactaacga ctcccagaac ttggttttct ttaataaccc tccatctgtc 

1501 actttgggcc aactcctgga agtgatgagc tggcaatttt catcctatgt 

1551 cggtcgtggc cttaattcag agcagctcaa catgctggca gagaagctca 

1601 cagttcagtc taactacaat gatggtcacc tcacctgggc caagttctgc 

1651 aaggaacatt tgcctggcaa aacatttacc ttctggactt ggcttgaagc 

1701 aatattggac ctaattaaaa aacatattct tcccctctgg attgatgggt 

1751 acatcatggg atttgttagt aaagagaagg aacggcttct gctcaaagat

1801 aaaatgcctg ggacattttt gttaagattc agtgagagcc atcttggagg
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FIG. 2D 

1851 gataaccttc acctgggtgg accaatctga aaatggagaa gtgagattcc 
1901 actctgtaga accctacaac aaagggagac tgtcggctct ggccttcgct 
1951 gacatcctgc gagactacaa ggttatcatg gctgaaaaca tccctgaaaa 
2001 ccctctgaag tacctctacc ctgacattcc caaagacaaa gcctttggca 
2051 aacactacag ctcccagccg tgcgaagtct caagaccaac cgaacgggga 
2101 gacaagggtt acgtcccctc tgtttttatc cccatttcaa caatccgaag 
2151 cgattccacg gagccacaat ctccttcaga ccttctcccc atgtctccaa 
2201 gtgcatatgc tgtgctgaga gaaaacctga gcccaacgac aattgaaact 
2251 gcaatgaatt ccccatattc tgctgaatga cggtgcaaac ggacacttta 
2301 aagaaggaag cagatgaaac tggagagtgt tctttaccat agatcacaat 
2351 ttatttcttc ggctttgtaa atacc 
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FIG. 3A 

1 MAQWNQLQQL DTRYLKQLHQ LYSDTFPMEL RQFLAPWIES QDWAYAASKE 
51 SHATLVFHNL LGEIDQQYSR FLQESNVLYQ HNLRRIKQFL QSRYLEKPME
101 IARIVARCLW EESRLLQTAA TAAQQGGQAN HPTAAVVTEK QQMLEQHLQD
151 VRKRVQDLEQ KMKVVENLQD DFDFNYKTLK SQGDMQDLNG NNQSVTRQKM
201 QQLEQMLTAL DQMRRSIVSE LAGLLSAMEY VQKTLTDEEL ADWKRRPEIA
251 CIGGPPNICL DRLENWITSL AESQLQTRQQ IKKLEELQQK VSYKGDPIVQ
301 HRPMLEERIV ELFRNLMKSA FVVERQPCMP MHPDRPLVIK TGVQFTTKVR
351 LLVKFPELNY QLKIKVCIDK DSGDVAALRG SRKFNILGTN TKVMNMEESN
401 NGSLSAEFKH LTLREQRCGN GGRANCDASL IVTEELHLIT FETEVYHQGL
451 KIDLETHSLP VVVISNICQM PNAWASILWY NMLTNNPKNV NFFTKPPIGT
501 WDQVAEVLSW QFSSTTKRGL SIEQLTTLAE KLLGPGVNYS GCQITWAKFC
551 KENMAGKGFS FWVWLDNIID LVKKYILALW NEGYIMGFIS KERERAILST
601 KPPGTFLLRF SESSKEGGVT FTWVEKDISG KTQIQSVEPY TKQQLNNMSF
651 AEIIMGYKIM DATNILVSPL VYLYPDIPKE EAFGKYCRPE SQEHPEADPG
701 SAAPYLKTKF ICVTPTTCSN TIDLPMSPRT LDSLWQFGNN GEGAEPSAGG 
751 QFESLTFDMD LTSECATSPM 
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1 gccgcgacca gccaggccgg ccagtcgggc tcagcccgga gacagtcgag 
51 acccctgact gcagcaggat ggctcagtgg aaccagctgc agcagctgga 
101 cacacgctac ctgaagcagc tgcaccagct gtacagcgac acgttcccca 
151 tggagctgcg gcagttcctg gcaccttgga ttgagagtca agactgggca 
201 tatgcagcca gcaaagagtc acatgccacg ttggtgtttc ataatctctt
251 gggtgaaatt gaccagcaat atagccgatt cctgcaagag tccaatgtcc 
301 tctatcagca caaccttcga agaatcaagc agtttctgca gagcaggtat 
351 cttgagaagc caatggaaat tgcccggatc gtggcccgat gcctgtggga
401 agagtctcgc ctcctccaga cggcagccac ggcagcccag caagggggcc 
451 aggccaacca cccaacagcc gccgtagtga cagagaagca gcagatgttg 
501 gagcagcatc ttcaggatgt ccggaagcga gtgcaggatc tagaacagaa 
551 aatgaaggtg gtggagaacc tccaggacga ctttgatttc aactacaaaa 
601 ccctcaagag ccaaggagac atgcaggatc tgaatggaaa caaccagtct
651 gtgaccagac agaagatgca gcagctggaa cagatgctca cagccctgga 
701 ccagatgcgg agaagcattg tgagtgagct ggcggggctc ttgtcagcaa
751 tggagtacgt gcagaagaca ctgactgatg aagagctggc tgactggaag 
801 aggcggccag agatcgcgtg catcggaggc cctcccaaca tctgcctgga 

851 ccgtctggaa aactggataa cttcattagc agaatctcaa cttcagaccc 
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FIG. 3C 

901 gccaacaaat taagaaactg gaggagctgc agcagaaagt gtcctacaag 
951 ggcgacccta tcgtgcagca ccggcccatg ctggaggaga ggatcgtgga 
1001 gctgttcaga aacttaatga agagtgcctt cgtggtggag cggcagccct 
1051 gcatgcccat gcacccggac cggcccttag tcatcaagac tggtgtccag 
1101 tttaccacga aagtcaggtt gctggtcaaa tttcctgagt tgaattatca 
1151 gcttaaaatt aaagtgtgca ttgataaaga ctctggggat gttgctgccc 
1201 tcagagggtc tcggaaattt aacattctgg gcacgaacac aaaagtgatg 
1251 aacatggagg agtctaacaa cggcagcctg tctgcagagt tcaagcacct 
1301 gacccttagg gagcagagat gtgggaatgg aggccgtgcc aattgtgatg 
1351 cctccttgat cgtgactgag gagctgcacc tgatcacctt cgagactgag 
1401 gtgtaccacc aaggcctcaa gattgaccta gagacccact ccttgccagt
1451 tgtggtgatc tccaacatct gtcagatgcc aaatgcttgg gcatcaatcc 
1501 tgtggtataa catgctgacc aataacccca agaacgtgaa cttcttcact 
1551 aagccgccaa ttggaacctg ggaccaagtg gccgaggtgc tcagctggca 
1601 gttctcgtcc accaccaagc gagggctgag catcgagcag ctgacaacgc 
1651 tggctgagaa gctcctaggg cctggtgtga actactcagg gtgtcagatc 
1701 acatgggcta aattctgcaa agaaaacatg gctggcaagg gcttctcctt 
1751 ctgggtctgg ctagacaata tcatcgacct tgtgaaaaag tatatcttgg
1801 ccctttggaa tgaagggtac atcatgggtt tcatcagcaa ggagcgggag 
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FIG. 3D 

1851 cgggccatcc taagcacaaa gcccccgggc accttcctac tgcgcttcag 
1901 cgagagcagc aaagaaggag gggtcacttt cacttgggtg gaaaaggaca 
1951 tcagtggcaa gacccagatc cagtctgtag agccatacac caagcagcag 
2001 ctgaacaaca tgtcatttgc tgaaatcatc atgggctata agatcatgga 
2051 tgcgaccaac atcctggtgt ctccacttgt ctacctctac cccgacattc 
2101 ccaaggagga ggcatttgga aagtactgta ggcccgagag ccaggagcac 
2151 cccgaagccg acccaggtag tgctgccccg tacctgaaga ccaagttcat 
2201 ctgtgtgaca ccaacgacct gcagcaatac cattgacctg ccgatgtccc 
2251 cccgcacttt agattcattg atgcagtttg gaaataacgg tgaaggtgct 
2301 gagccctcag caggagggca gtttgagtcg ctcacgtttg acatggatct 
2351 gacctcggag tgtgctacct cccccatgtg aggagctgaa accagaagct 
2401 gcagagacgt gacttgagac acctgccccg tgctccaccc ctaagcagcc 
2451 gaaccccata tcgtctgaaa ctcctaactt tgtggttcca gatttttttt 
2501 tttaatttcc tacttctgct atctttgggc aatctgggca ctttttaaaa 
2551 gagagaaatg agtgagtgtg ggtgataaac tgttatgtaa agaggagaga 
2601 cctctgagtc tggggatggg gctgagagca gaagggaggc aaaggggaac 
2651 acctcctgtc ctgcccgcct gccctccttt ttcagcagct cgggggttgg 
2701 ttgttagaca agtgcctcct ggtgcccatg gctacctgtt gccccactct 

2751 gtgagctgat accccattct gggaactcct ggctctgcac tttcaacctt 
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